mesa.git
Christian König [Mon, 9 Sep 2013 09:58:53 +0000 (03:58 -0600)]
st/vdpau: use new vlc function to search for VC-1 start codes

Signed-off-by: Christian König <christian.koenig@amd.com>
Christian König [Mon, 9 Sep 2013 09:57:58 +0000 (03:57 -0600)]
vl/mpeg12: use new vlc function to search for start codes

Signed-off-by: Christian König <christian.koenig@amd.com>
Christian König [Mon, 9 Sep 2013 09:47:10 +0000 (03:47 -0600)]
vl/vlc: add fast forward search for byte value

Commonly used to find start codes; has far less overhead
than searching manually.
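
As a rough illustration of the idea (not the actual vl_vlc code), a
start-code search can let memchr skip ahead to each candidate byte and
only then check the bytes before it. The function name find_start_code
is hypothetical, and the sketch assumes the usual 00 00 01 start-code
prefix.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper, not Mesa's vl_vlc API: returns the offset of the
// first 00 00 01 start-code prefix in buf, or len if none is found.
std::size_t find_start_code(const std::uint8_t *buf, std::size_t len)
{
    std::size_t pos = 2; // the 0x01 byte of a prefix cannot sit before index 2
    while (pos < len) {
        // Let memchr do the fast forward scan for the next 0x01 byte.
        const void *hit = std::memchr(buf + pos, 0x01, len - pos);
        if (!hit)
            break;
        pos = static_cast<std::size_t>(
            static_cast<const std::uint8_t *>(hit) - buf);
        // Only now look at the two preceding bytes.
        if (buf[pos - 1] == 0x00 && buf[pos - 2] == 0x00)
            return pos - 2;
        ++pos;
    }
    return len;
}
```

The byte-by-byte work is limited to the two-byte check at each candidate;
everything in between is left to the C library's search routine.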

Signed-off-by: Christian König <christian.koenig@amd.com>
Vinson Lee [Tue, 24 Sep 2013 05:13:37 +0000 (22:13 -0700)]
glsl: Initialize ir_lower_jumps_visitor member variables.

Fixes "Uninitialized scalar field" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Vinson Lee [Tue, 24 Sep 2013 04:47:48 +0000 (21:47 -0700)]
glsl: Initialize lower_vector_visitor::dont_lower_swz.

Fixes "Uninitialized scalar field" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Vinson Lee [Tue, 24 Sep 2013 05:02:27 +0000 (22:02 -0700)]
glsl: Initialize assignment_generator member variables.

Fixes "Uninitialized pointer field" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Vinson Lee [Tue, 24 Sep 2013 04:41:39 +0000 (21:41 -0700)]
glsl: Remove unused pointer value.

Silences "Unused pointer value" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Zack Rusin [Tue, 24 Sep 2013 19:08:35 +0000 (15:08 -0400)]
Revert "llvmpipe: increase number of subpixel bits to eight"

This reverts commit 755c11dc5e94f17097c186edaaa39d818396f14c.
We agreed that this is a band-aid that's not very useful and
the proper solution is to rewrite the rasterization algo
so that it operates on 64 bit values.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Dylan Noblesmith [Fri, 20 Sep 2013 15:55:41 +0000 (11:55 -0400)]
mesa: remove handcounted magic number

Also make it a compile-time error with STATIC_ASSERT.
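
The message does not show the code, so the following is only a sketch of
the general pattern, using a hypothetical enum and table: derive the count
from the array itself and let a static assertion catch any mismatch at
build time. It uses C++11 static_assert in place of Mesa's STATIC_ASSERT
macro.

```cpp
#include <cstddef>

// Hypothetical enum and table, invented for this sketch.
enum stage { STAGE_VERTEX, STAGE_GEOMETRY, STAGE_FRAGMENT, STAGE_COUNT };

const char *const stage_names[] = { "vertex", "geometry", "fragment" };

// Element count computed by the compiler instead of counted by hand.
constexpr std::size_t num_stage_names =
    sizeof(stage_names) / sizeof(stage_names[0]);

// Adding an enum value without a matching name now breaks the build.
static_assert(num_stage_names == static_cast<std::size_t>(STAGE_COUNT),
              "stage_names is out of sync with enum stage");
```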

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Dylan Noblesmith [Fri, 20 Sep 2013 15:55:27 +0000 (11:55 -0400)]
mesa: remove outdated comment

No such argument exists since this commit:

commit 92f3fca0ea429dcf07123e63447449db53308266
Author:     Ian Romanick <ian.d.romanick@intel.com>
AuthorDate: Sun Aug 21 17:23:58 2011 -0700
Commit:     Ian Romanick <ian.d.romanick@intel.com>
CommitDate: Tue Aug 23 14:52:09 2011 -0700

    mesa: Remove target parameter from dd_function_table::BufferSubData

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Dylan Noblesmith [Fri, 20 Sep 2013 15:55:19 +0000 (11:55 -0400)]
mesa: remove stale comment

This line stopped making sense in the great sed
replace of commit f9995b30756140724f41daf963fa06167912be7f

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Zack Rusin [Mon, 23 Sep 2013 21:29:39 +0000 (17:29 -0400)]
llvmpipe: align the array used for subdivided vertices

When subdividing a triangle we're using a temporary array to store
the new coordinates for the subdivided triangles. Unfortunately
the array used for that was not aligned properly causing
random crashes in the llvm jit code which was trying to load
vectors from it.
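
A minimal sketch of the kind of fix described, assuming 16-byte vector
loads and a hypothetical array layout (the real llvmpipe structure and
alignment macro are not shown in this log). C++11 alignas stands in for
whatever alignment attribute the driver uses.

```cpp
#include <cstdint>

// Hypothetical temporary storage: 3 triangles x 3 vertices x 4 floats.
// alignas(16) guarantees every 4-float vertex starts on a 16-byte boundary.
struct subdivided_vertices {
    alignas(16) float coords[3][3][4];
};

// Returns true when p is suitable for a 16-byte aligned vector load.
bool is_aligned_16(const void *p)
{
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}
```

Because each vertex occupies exactly 16 bytes, aligning the start of the
array is enough to align every vertex inside it.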

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Vinson Lee [Mon, 23 Sep 2013 21:07:15 +0000 (14:07 -0700)]
glapi: Move declaration before code.

This patch fixes the MSVC build error introduced by commit
673129e0b936b1c748e988d3f74f3efaab9e5693.

enums.c
mesa\main\enums.c(3776) : error C2143: syntax error : missing ';' before 'type'
mesa\main\enums.c(3781) : error C2065: 'elt' : undeclared identifier
mesa\main\enums.c(3781) : warning C4047: '!=' : 'int' differs in levels of indirection from 'void *'
mesa\main\enums.c(3782) : error C2065: 'elt' : undeclared identifier
mesa\main\enums.c(3782) : error C2223: left of '->offset' must point to struct/union
mesa\main\enums.c(3782) : warning C4033: '_mesa_lookup_enum_by_nr' must return a value

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Eric Anholt [Fri, 20 Sep 2013 19:37:04 +0000 (12:37 -0700)]
mesa: Use -Bsymbolic in the linker to locally resolve Mesa-internal symbols.

Normally, LD_PRELOAD will take precedence over your own symbols, which you
want for things like malloc() in libc.  But we don't have any local
symbols we would want overridden (like hash_table_insert(), for example!),
so tell the linker to resolve them internally.  This also avoids calls
through the PLT.

Saves almost 100k on libdricore's size, and gets us a bunch of the
performance back that we had with non-dricore.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Eric Anholt [Fri, 20 Sep 2013 18:03:44 +0000 (11:03 -0700)]
glsl: Hide many classes local to individual .cpp files in anon namespaces.

This gives the compiler the chance to inline and not export class symbols
even in the absence of LTO.  Saves about 60kb on disk.
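
For illustration only, with invented class and function names: a helper
class wrapped in an anonymous namespace has internal linkage, so the
compiler knows no other translation unit can reference it and is free to
inline it and omit its symbols.

```cpp
// Hypothetical file-local helper; the names are invented for this sketch.
namespace {

class node_counter {
public:
    void visit() { ++count_; }
    int count() const { return count_; }

private:
    int count_ = 0;
};

} // anonymous namespace

// Only this function is visible outside the translation unit.
int count_visits(int n)
{
    node_counter counter;
    for (int i = 0; i < n; ++i)
        counter.visit();
    return counter.count();
}
```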

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Eric Anholt [Thu, 19 Sep 2013 23:16:16 +0000 (16:16 -0700)]
mesa: Drop an extra copy-and-pasted copy in the program clone function.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Eric Anholt [Fri, 20 Sep 2013 03:06:35 +0000 (20:06 -0700)]
mesa: Convert some runtime asserts to static asserts.

Noticed while grepping through the code for something else.

v2: Don't convert really-runtime asserts to static asserts.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Eric Anholt [Thu, 19 Sep 2013 21:54:13 +0000 (14:54 -0700)]
mesa: Shrink the size of the enum string lookup struct.

Since it's only used for debug information, we can misalign the struct and
save the disk space.  Another 19k on a 64-bit build.

v2: Make a compiler.h macro to only use the attribute if we know we can.
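
A sketch of the approach under stated assumptions: the macro name
SKETCH_PACKED and both structs are hypothetical, and the attribute is only
applied when the compiler is known to understand it, as the v2 note
suggests.

```cpp
#include <cstdint>

// Use the attribute only on compilers known to support it.
#if defined(__GNUC__)
#define SKETCH_PACKED __attribute__((__packed__))
#else
#define SKETCH_PACKED
#endif

// Hypothetical lookup entries: same fields, with and without packing.
struct padded_entry {
    std::uint16_t offset; // offset of the name in a string table
    int value;            // enum value
};

struct SKETCH_PACKED packed_entry {
    std::uint16_t offset;
    int value;
};

// Packing may shrink the struct but must never grow it.
static_assert(sizeof(packed_entry) <= sizeof(padded_entry),
              "packing must never grow the struct");
```

With GCC or Clang on common targets the packed entry is typically 6 bytes
instead of 8; the cost is a misaligned int member, which is acceptable for
data that is only read on debug paths.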

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Eric Anholt [Thu, 19 Sep 2013 19:06:54 +0000 (12:06 -0700)]
mesa: Remove the extra enum strings and extra lookup table.

Now that there's no name -> enum direction, we can drop the extra strings,
and merge the offsets table and the reduced_enums table.

Between the previous commit and this one, Mesa core drops by 30k.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Eric Anholt [Thu, 19 Sep 2013 18:48:24 +0000 (11:48 -0700)]
mesa: Remove _mesa_lookup_enum_by_name().

It's been unused for a long time.  I stopped digging through git history
as of 2009.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Zack Rusin [Thu, 19 Sep 2013 18:10:08 +0000 (14:10 -0400)]
llvmpipe: increase number of subpixel bits to eight

Unfortunately d3d10 requires much higher precision (e.g.
wgf11clipping tests for it). The smallest number of precision
bits with which it passes is 8. That means that we need to
decrease the maximum length of an edge that we can handle without
subdivision by 4 bits. Abstracted the code a bit to make it easier
to change once we switch to 64-bit rasterization.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Vinson Lee [Sun, 22 Sep 2013 23:08:26 +0000 (16:08 -0700)]
glsl: Define isnormal and copysign for MSVC to fix build.

This patch fixes these MSVC build errors.

ir_constant_expression.cpp
src\glsl\ir_constant_expression.cpp(564) : warning C4244: '=' : conversion from 'int' to 'float', possible loss of data
src\glsl\ir_constant_expression.cpp(1384) : error C3861: 'isnormal': identifier not found
src\glsl\ir_constant_expression.cpp(1385) : error C3861: 'copysign': identifier not found

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69541
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Acked-by: Matt Turner <mattst88@gmail.com>
Johannes Obermayr [Wed, 11 Sep 2013 22:32:40 +0000 (00:32 +0200)]
Suppress clang's warnings about unused CFLAGS and CXXFLAGS.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Christian König [Sat, 21 Sep 2013 13:34:38 +0000 (15:34 +0200)]
radeon/uvd: async flush the UVD cs

No need to block for the CS thread here.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Christian König [Sat, 21 Sep 2013 10:25:13 +0000 (12:25 +0200)]
winsys/radeon: share winsys between different fd's

Share the winsys between different fd's if they point to the same device.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Christian König [Sat, 21 Sep 2013 13:24:55 +0000 (15:24 +0200)]
winsys/radeon: remove cs_queue_empty

Waiting for an empty queue is nonsense and can lead to deadlocks if we have
multiple waiters or another thread that continuously sends down new commands.

Just post the cs to the queue and immediately wait for it to finish.

This is a candidate for the stable branch.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Christian König [Sat, 21 Sep 2013 11:21:47 +0000 (13:21 +0200)]
winsys/radeon: fix killing the CS thread

Kill the thread only after we checked that it's not used any more, not before.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Eric Anholt [Wed, 18 Sep 2013 19:32:31 +0000 (12:32 -0700)]
i965/gen4: Fix fragment program rectangle texture shadow compares.

The rescale_texcoord(), if it does something, will return just the
GLSL-sized coordinate, leaving out the 3rd and 4th components where we
were storing our projected shadow compare and the texture projector.
Deref the shadow compare before using the shared rescale-the-coordinate
code to fix the problem.

Fixes piglit tex-shadow2drect.shader_test and txp-shadow2drect.shader_test

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69525
NOTE: This is a candidate for stable branches.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Abdiel Janulgue [Fri, 20 Sep 2013 10:56:52 +0000 (13:56 +0300)]
i965/gen7.5: Fix missing Shader Channel Select entries on Haswell

Probably unintentional, but the SURFACE_STATE setup refactoring
for buffer surfaces had missed the scs bits when creating constant
surface states.

Fixes broken GLB 2.5 on Haswell where the knight's textures are missing

Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Kenneth Graunke [Wed, 18 Sep 2013 21:11:32 +0000 (14:11 -0700)]
i965, mesa: Use the new DECLARE_R[Z]ALLOC_CXX_OPERATORS macros.

These classes declared a placement new operator, but didn't declare a
delete operator.  Switching to the macro gives them a delete operator,
which probably is a good idea anyway.

This also eliminates a lot of boilerplate.

v2: Properly use RZALLOC in Mesa IR/TGSI translators.  Caught by Eric
    and Chad.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Kenneth Graunke [Wed, 18 Sep 2013 21:05:36 +0000 (14:05 -0700)]
glsl: Use the new DECLARE_R[Z]ALLOC_CXX_OPERATORS in a bunch of places.

This eliminates a lot of boilerplate and should be 100% equivalent.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Kenneth Graunke [Wed, 18 Sep 2013 20:56:26 +0000 (13:56 -0700)]
ralloc: Introduce new macros for defining C++ new/delete operators.

Most of our C++ classes define placement new and delete operators so we
can do convenient allocation via:

   thing *foo = new(mem_ctx) thing(...)

Currently, this is done via a lot of boilerplate.  By adding simple
macros to ralloc, we can condense this to a single line, making it
trivial to add this feature to a new class.
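
To make the pattern concrete, here is a toy version. Everything in it is
hypothetical: toy_ctx stands in for a ralloc memory context,
DECLARE_TOY_NEW_DELETE is not ralloc's macro, and calloc/free replace the
real context-based allocation.

```cpp
#include <cstddef>
#include <cstdlib>

// Toy stand-in for a ralloc memory context; it only counts allocations.
struct toy_ctx {
    int allocations = 0;
};

// Hypothetical one-line macro that gives a class a context-taking
// operator new plus the matching delete operators.
#define DECLARE_TOY_NEW_DELETE()                                      \
    static void *operator new(std::size_t size, toy_ctx *ctx)         \
    {                                                                 \
        void *p = std::calloc(1, size);                               \
        if (!p)                                                       \
            std::abort();                                             \
        ctx->allocations++;                                           \
        return p;                                                     \
    }                                                                 \
    /* Matching placement delete, used if a constructor throws. */    \
    static void operator delete(void *p, toy_ctx *) { std::free(p); } \
    static void operator delete(void *p) { std::free(p); }

struct thing {
    DECLARE_TOY_NEW_DELETE()
    explicit thing(int v) : value(v) {}
    int value;
};
```

With this in place, thing *foo = new(&ctx) thing(7) allocates through the
context-taking operator, and delete foo releases it through the class's
own operator delete.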

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Grigori Goronzy [Tue, 10 Sep 2013 23:41:40 +0000 (01:41 +0200)]
r600g: fast color clears for single-sample buffers

Allocate a CMASK on demand and use it to fast clear single-sample
colorbuffers. Both FBOs and window system colorbuffers are fast
cleared. Expand as needed when colorbuffers are mapped or displayed
on screen.

v2: cosmetics, move transfer expansion into dma_blit

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Grigori Goronzy [Tue, 10 Sep 2013 23:41:39 +0000 (01:41 +0200)]
r600g: add support for separately allocated CMASKs

v2: check for NULL cbufs

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Marek Olšák [Fri, 20 Sep 2013 13:08:29 +0000 (15:08 +0200)]
gallium: add flush_resource context function

r600g needs explicit flushing before DRI2 buffers are presented on the screen.

v2: add (stub) implementations for all drivers, fix frontbuffer flushing
v3: fix galahad

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Marek Olšák [Wed, 18 Sep 2013 13:40:21 +0000 (15:40 +0200)]
radeonsi: simplify and fix MSAA texture sampling for array textures

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Wed, 18 Sep 2013 13:36:38 +0000 (15:36 +0200)]
radeonsi: fix textureOffset and texelFetchOffset GLSL functions

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
José Fonseca [Fri, 20 Sep 2013 11:58:59 +0000 (12:58 +0100)]
llvmpipe: Fix rendering to PIPE_FORMAT_R10G10B10A2_UNORM.

We must take rounding into consideration when re-scaling to narrow
normalized channels, such as 2-bit normalized alpha.
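
The arithmetic behind this can be sketched with plain integers. This is
not the llvmpipe code path; both helper functions are hypothetical and
simply contrast a truncating shift with round-to-nearest when going from
an 8-bit to a narrower normalized channel.

```cpp
#include <cstdint>

// Rescale an 8-bit unorm value to dst_bits bits by dropping low bits.
constexpr std::uint32_t unorm8_to_narrow_shift(std::uint32_t x,
                                               unsigned dst_bits)
{
    return x >> (8 - dst_bits);
}

// Rescale with round-to-nearest: round(x * max_dst / 255), where adding
// 127 before the integer division implements the rounding.
constexpr std::uint32_t unorm8_to_narrow_round(std::uint32_t x,
                                               unsigned dst_bits)
{
    return (x * ((1u << dst_bits) - 1) + 127) / 255;
}
```

For a 2-bit target, the input 192 is about 2.26 on the 0..3 scale, so
rounding gives 2 while the shift gives 3; the input 43 is about 0.51, so
rounding gives 1 while the shift gives 0.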

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
José Fonseca [Wed, 18 Sep 2013 19:01:54 +0000 (20:01 +0100)]
draw: Ensure draw_pt_middle_end::bind_parameters is never NULL.

Prevents calling NULL pointer with softpipe in certain cases.

Trivial.

José Fonseca [Wed, 18 Sep 2013 19:00:50 +0000 (20:00 +0100)]
tools/trace: Simple script to compare two traces.

Based on the earlier apitrace tracediff.sh script.

Ian Romanick [Tue, 10 Sep 2013 16:48:21 +0000 (11:48 -0500)]
mesa: Silence GCC warning 'comparison between signed and unsigned integer expressions'

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Ian Romanick [Tue, 10 Sep 2013 16:43:56 +0000 (11:43 -0500)]
mesa: Fix broken call to print_table_stats

The function takes a parameter, but none was given.  Also, in the
non-GET_DEBUG case, silence the unused parameter warning.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Ian Romanick [Tue, 10 Sep 2013 16:34:11 +0000 (11:34 -0500)]
glsl: Set VertexProgram.MaxOutputComponents and FragmentProgram.MaxInputComponents in standalone scaffolding

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Ian Romanick [Tue, 10 Sep 2013 15:22:47 +0000 (10:22 -0500)]
mesa: Allow several ARB_geometry_shader4 queries in OpenGL 3.2

GL_MAX_GEOMETRY_TEXTURE_IMAGE_UNITS, GL_MAX_GEOMETRY_OUTPUT_VERTICES,
GL_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS, and
GL_MAX_GEOMETRY_UNIFORM_COMPONENTS all have the same enum value and
meaning as their _ARB counterparts.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Ian Romanick [Tue, 10 Sep 2013 15:18:17 +0000 (10:18 -0500)]
mesa: Expose MAX_GEOMETRY_{INPUT,OUTPUT}_COMPONENTS on OpenGL 3.2

The comment '# GL 3.0 / GLES3' was incorrect.  The
MAX_VERTEX_OUTPUT_COMPONENTS and MAX_FRAGMENT_INPUT_COMPONENTS queries
were added in OpenGL 3.2 (with geometry shaders) and OpenGL ES 3.0.
This just fixes that comment.

v2: Add the GEOMETRY queries in the existing '# GL 3.2' section since
they have nothing to do with GLES3.  Suggested by Paul.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Ian Romanick [Tue, 10 Sep 2013 15:12:16 +0000 (10:12 -0500)]
mesa: Get GL_MAX_FRAGMENT_INPUT_COMPONENTS from FragmentProgram.MaxInputComponents

In OpenGL ES 3.0 the minimum-maximum for GL_MAX_VERTEX_OUTPUT_VECTORS is 16,
but the minimum-maximum for GL_MAX_FRAGMENT_INPUT_VECTORS is 15.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Ian Romanick [Tue, 10 Sep 2013 15:10:07 +0000 (10:10 -0500)]
mesa: Get GL_MAX_VERTEX_OUTPUT_COMPONENTS from VertexProgram.MaxOutputComponents

In OpenGL ES 3.0 the minimum-maximum for GL_MAX_VERTEX_OUTPUT_VECTORS is 16,
but the minimum-maximum for GL_MAX_FRAGMENT_INPUT_VECTORS is 15.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Ian Romanick [Tue, 10 Sep 2013 15:07:10 +0000 (10:07 -0500)]
i915: Set VertexProgram.MaxOutputComponents and FragmentProgram.MaxInputComponents

This was the only remaining place in Mesa that sets MaxVaryings without
also setting these values.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Ian Romanick [Wed, 18 Sep 2013 20:29:00 +0000 (15:29 -0500)]
i965: Set *Program.Max{Input,Output}Components

Now that MaxVaryings is > 16, VertexProgram.MaxOutputComponents,
GeometryProgram.MaxInputComponents, GeometryProgram.MaxOutputComponents,
and FragmentProgram.MaxInputComponents also need to be set.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: Paul Berry <stereotype441@gmail.com>
Ian Romanick [Tue, 10 Sep 2013 14:58:47 +0000 (09:58 -0500)]
mesa: Set default values for Max{Input,Output}Components in init_program_limits

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Ian Romanick [Tue, 10 Sep 2013 14:39:38 +0000 (09:39 -0500)]
mesa: Remove gl_constants::MaxVaryingComponents

There are no longer any users.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Cc: Zack Rusin <zackr@vmware.com>
Ian Romanick [Tue, 10 Sep 2013 14:35:58 +0000 (09:35 -0500)]
mesa: Use correct data for MAX_{VERTEX,GEOMETRY}_VARYING_COMPONENTS_ARB queries

Previously gl_constants::MaxVaryingComponents was used.  Now
gl_constants::VertexProgram::MaxOutputs and
gl_constants::GeometryProgram::MaxOutputs are used.

This means that st_extensions.c had to be updated to set these fields
instead of MaxVaryingComponents.  It was previously the only place that
set MaxVaryingComponents.

I believe that the structure is allocated by calloc, so the value should
be initialized to zero in non-Gallium drivers before and after my
change.  Right now nobody enables GL_ARB_geometry_shader4, so it's
pretty much dead code anyway.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Cc: Zack Rusin <zackr@vmware.com>
Ian Romanick [Tue, 10 Sep 2013 14:21:56 +0000 (09:21 -0500)]
mesa: Track per-stage shader input and output limits independently

In OpenGL 3.2 these are independently queryable.  In addition, the spec
has different minimum-maximums for various values.
GL_MAX_VERTEX_OUTPUT_COMPONENTS is 64, but
GL_MAX_GEOMETRY_OUTPUT_COMPONENTS (and GL_MAX_FRAGMENT_INPUT_COMPONENTS)
is 128.

In OpenGL ES 3.0 these are also independently queryable.  The spec has
different minimum-maximums for various values.
GL_MAX_VERTEX_OUTPUT_VECTORS is 16, but GL_MAX_FRAGMENT_INPUT_VECTORS
is 15.
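
Why a single shared limit cannot express this is easy to see in numbers.
The struct and field names below are hypothetical (they are not the
gl_constants fields); the sketch only converts the ES 3.0 vector counts
quoted above into components, at four components per vector.

```cpp
// One vector holds four scalar components.
constexpr unsigned components_from_vectors(unsigned vectors)
{
    return vectors * 4;
}

// Hypothetical record with one limit per stage interface.
struct es3_limits {
    unsigned vertex_max_output_components;
    unsigned fragment_max_input_components;
};

// ES 3.0 minimum-maximums from the message: 16 and 15 vectors.
constexpr es3_limits es3_minimums = {
    components_from_vectors(16),
    components_from_vectors(15),
};

// The two values differ, so they need separate fields.
static_assert(es3_minimums.vertex_max_output_components !=
                  es3_minimums.fragment_max_input_components,
              "per-stage limits are not interchangeable");
```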

None of these values are used yet.  I have just added space to the
structures.  Future patches will add users and eventually remove some
old fields.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Cc: Zack Rusin <zackr@vmware.com>
Ian Romanick [Mon, 9 Sep 2013 21:54:11 +0000 (16:54 -0500)]
mesa: Support GL_MAX_VERTEX_OUTPUT_COMPONENTS query with ES3

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Cc: "9.1 9.2" <mesa-stable@lists.freedesktop.org>
Kenneth Graunke [Sat, 14 Sep 2013 03:12:56 +0000 (20:12 -0700)]
i965: Refactor Gen4-6 SURFACE_STATE setup for buffer surfaces.

This was an embarrassingly large amount of copy-and-pasted code,
and it wasn't particularly simple code either.  By factoring it out
into a helper function, we consolidate the complexity.

v2: Properly NULL-check bo.  Caught by Eric Anholt.
v3: Do the subtraction by 1 in gen7_emit_buffer_surface_state, rather
    than making callers do it.  This makes the buffer_size parameter
    the actual size of the buffer.  Suggested by Paul Berry.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Kenneth Graunke [Sat, 14 Sep 2013 00:46:47 +0000 (17:46 -0700)]
i965: Refactor Gen7+ SURFACE_STATE setup for buffer surfaces.

This was an embarrassingly large amount of copy-and-pasted code,
and it wasn't particularly simple code either.  By factoring it out
into a helper function, we consolidate the complexity.

v2: Properly NULL-check bo.  Caught by Eric Anholt.
v3: Do the subtraction by 1 in gen7_emit_buffer_surface_state, rather
    than making callers do it.  This makes the buffer_size parameter
    the actual size of the buffer.  Suggested by Paul Berry.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Kenneth Graunke [Tue, 17 Sep 2013 18:23:59 +0000 (11:23 -0700)]
i965: Fix off by one errors in texture buffer size calculations.

The value that's split into width/height/depth needs to be the size of
the buffer minus one.  This makes it consistent with the constant buffer
and shader time SURFACE_STATE setup code.
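
A sketch of the calculation, with assumptions called out: the function and
struct names are hypothetical, and the 7/13/7-bit field widths are chosen
for illustration rather than taken from this log. The point is that the
subtraction happens once, before the value is split.

```cpp
#include <cassert>
#include <cstdint>

struct surface_dims {
    std::uint32_t width;
    std::uint32_t height;
    std::uint32_t depth;
};

// Split (buffer_size - 1) into three bit fields. The field widths
// (7, 13 and 7 bits) are assumptions made for this sketch.
surface_dims split_buffer_size(std::uint32_t buffer_size)
{
    assert(buffer_size > 0);
    const std::uint32_t n = buffer_size - 1; // encode size minus one
    surface_dims d;
    d.width = n & 0x7f;
    d.height = (n >> 7) & 0x1fff;
    d.depth = (n >> 20) & 0x7f;
    return d;
}
```

Forgetting the subtraction makes a buffer of 128 entries encode as width 0,
height 1, which describes 129 entries, one past the real end.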

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Kenneth Graunke [Tue, 17 Sep 2013 18:54:05 +0000 (11:54 -0700)]
i965: Fix writemask != 0 assertions on Sandybridge.

This fixes myriads of regressions since commit 169f9c030c16d1247a3a7629
("i965: Add an assertion that writemask != NULL for non-ARFs.").

On Sandybridge, our control flow handling (such as brw_IF) does:

   brw_set_dest(p, insn, brw_imm_w(0));
   insn->bits1.branch_gen6.jump_count = 0;

This results in an IMM destination with zero for the writemask.  IMM
destinations are rather bizarre, but the code has been working for ages,
so I'm loath to change it.

Fixes glxgears on Sandybridge.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Kenneth Graunke [Tue, 17 Sep 2013 06:35:41 +0000 (23:35 -0700)]
glsl: Delete builtin_builder::shader when destroying built-ins.

I would use _mesa_delete_shader, but it's declared static, and we don't
really need any of the stuff in it anyway.

This fixes a memory leak caught by Valgrind.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Kenneth Graunke [Tue, 17 Sep 2013 06:41:57 +0000 (23:41 -0700)]
i965: Fix brw_gs_prog_data_compare to actually check field members.

&a and &b are the address of the local stack variables, not the actual
structures.  Instead of comparing the fields of a and b, we compared
...some stack memory.
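
The mistake is easy to reproduce in miniature. The struct and both
functions here are hypothetical, and the buggy variant is a simplified,
well-defined form of the problem: it compares the two pointer variables
rather than reading past them.

```cpp
#include <cstring>

// Hypothetical program-data struct; two ints, so no padding bytes.
struct prog_data {
    int num_surfaces;
    int uses_clip_distance;
};

// Buggy: &a and &b are the addresses of the local pointer variables,
// so this compares the two pointer values, not the structs.
bool compare_buggy(const prog_data *a, const prog_data *b)
{
    return std::memcmp(&a, &b, sizeof(a)) == 0;
}

// Fixed: pass the pointers themselves so the fields are compared.
bool compare_fixed(const prog_data *a, const prog_data *b)
{
    return std::memcmp(a, b, sizeof(*a)) == 0;
}
```

Two distinct structs with identical contents compare equal with the fixed
version and unequal with the buggy one.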

Not a candidate for stable since GS code doesn't exist in 9.2.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Kenneth Graunke [Tue, 17 Sep 2013 05:39:37 +0000 (22:39 -0700)]
i965: Fix brw_vs_prog_data_compare to actually check field members.

&a and &b are the address of the local stack variables, not the actual
structures.  Instead of comparing the fields of a and b, we compared
...some stack memory.

Caught by Valgrind on Piglit's glsl-lod-bias test (among many others).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68233
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Cc: mesa-stable@lists.freedesktop.org
Kenneth Graunke [Fri, 13 Sep 2013 22:55:03 +0000 (15:55 -0700)]
i965: Move binding table code to a new file, brw_binding_tables.c.

The code to upload the binding tables for each stage was scattered
across brw_{vs,gs,wm}_surface_state.c and brw_misc_state.c, which also
contain a lot of code to populate individual SURFACE_STATE structures.

This patch brings all the binding table upload code together, and splits
it out from the code which fills in SURFACE_STATE entries.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Kenneth Graunke [Fri, 13 Sep 2013 22:27:04 +0000 (15:27 -0700)]
i965: Use brw_upload_binding_table() for the pixel shader as well.

This is not quite the same: brw_upload_binding_table() also has code to
early-return if there are no entries, while the existing code did not.

The PS binding table is unlikely to be empty since it will have at least
one color buffer.  If it ever is empty, early returning seems wise.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Kenneth Graunke [Fri, 13 Sep 2013 22:13:49 +0000 (15:13 -0700)]
i965: Generalize brw_vec4_upload_binding_table() beyond vec4 stages.

Instead of passing in a brw_vec4_prog_data structure, we can simply
pass the one field it needs: the number of entries in the binding table.

We also need to pass in the shader time surface index rather than
hardcoding SURF_INDEX_VEC4_SHADER_TIME.

Since the resulting function is stage-agnostic, this patch removes
"vec4_" from the name.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Kenneth Graunke [Fri, 13 Sep 2013 21:51:10 +0000 (14:51 -0700)]
i965: Convert loop to memcpy in brw_vec4_upload_binding_table().

This is probably more efficient.  At any rate, it's less code.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
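
The loop-to-memcpy conversion described above can be sketched in isolation. This is a minimal illustration, not the driver's code: the helper names copy_entries_loop and copy_entries_memcpy are hypothetical, and the entries are assumed to be plain 32-bit offsets stored contiguously.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy table entries one at a time. */
static void copy_entries_loop(uint32_t *dst, const uint32_t *src, size_t count)
{
   for (size_t i = 0; i < count; i++)
      dst[i] = src[i];
}

/* Same effect with a single call; dst and src must not overlap. */
static void copy_entries_memcpy(uint32_t *dst, const uint32_t *src, size_t count)
{
   memcpy(dst, src, count * sizeof(uint32_t));
}
```

Both helpers leave dst with identical contents; the memcpy form simply hands the copy to the C library instead of spelling it out.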
11 years agoi965: Update comments in brw_vec4_upload_binding_table().
Kenneth Graunke [Fri, 13 Sep 2013 21:45:34 +0000 (14:45 -0700)]
i965: Update comments in brw_vec4_upload_binding_table().

The first comment was a bit stale; there are more kinds of surfaces than
textures and pull constants.

The second was a leftover "to do" comment for something I already did.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
11 years agowinsys/sw/xlib: fix compile error in xlib_sw_winsys.c.
Gaetan Nadon [Tue, 17 Sep 2013 19:46:10 +0000 (15:46 -0400)]
winsys/sw/xlib: fix compile error in xlib_sw_winsys.c.

xlib_sw_winsys.h:5:22: fatal error: X11/Xlib.h: No such file or directory

The compiler cannot find the Xlib.h in the installed system headers.
All supplied include directives point to inside the mesa module.
The X11_CFLAGS variable is undefined (not defined in config.status).

It appears the intent was to use X11_INCLUDES defined in configure.ac.

The Xlib.h file is not installed on my workstation. It is supplied in
the libx11-dev package. This allows an X developer control over which
version of this file is used for X development.

Signed-off-by: Gaetan Nadon <memsize@videotron.ca>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoglx: fix compile error in egl_glx.c.
Gaetan Nadon [Tue, 17 Sep 2013 19:46:09 +0000 (15:46 -0400)]
glx: fix compile error in egl_glx.c.

egl_glx.c:40:22: fatal error: X11/Xlib.h: No such file or directory

The compiler cannot find the Xlib.h in the installed system headers.
All supplied include directives point to inside the mesa module.
The X11_CFLAGS variable is undefined (not defined in config.status).

It appears the intent was to use X11_INCLUDES defined in configure.ac.

The Xlib.h file is not installed on my workstation. It is supplied in
the libx11-dev package. This allows an X developer control over which
version of this file is used for X development.

Signed-off-by: Gaetan Nadon <memsize@videotron.ca>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agofreedreno/a3xx: fix typo mixup w/ mipfilter
Rob Clark [Thu, 19 Sep 2013 14:28:52 +0000 (10:28 -0400)]
freedreno/a3xx: fix typo mixup w/ mipfilter

Signed-off-by: Rob Clark <robclark@freedesktop.org>
11 years agofreedreno: fix glReadPixels
Rob Clark [Thu, 19 Sep 2013 14:08:38 +0000 (10:08 -0400)]
freedreno: fix glReadPixels

duh, we still need to flush if there are pending draws and it isn't an
unsynchronized case.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
11 years agogallivm: adjust wrap mode to CLAMP_TO_EDGE always for cube maps.
Roland Scheidegger [Thu, 19 Sep 2013 15:13:18 +0000 (17:13 +0200)]
gallivm: adjust wrap mode to CLAMP_TO_EDGE always for cube maps.

Technically, without seamless filtering enabled, GL allows any wrap mode. That
made sense when true borders were supported (a seamless effect is possible with
a border and CLAMP_TO_BORDER), but gallium doesn't support borders, d3d9
requires wrap modes to be ignored, and it's a pain to fix up the sampler state
(as it makes it texture dependent). It is difficult to imagine a situation
where an app really wants another behavior, so just cheat here. (It looks like
some graphics hw (intel) actually requires this too, hence it should be safe.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
11 years agoandroid: Remove builtin_compiler
Adrian Negreanu [Fri, 13 Sep 2013 08:58:33 +0000 (11:58 +0300)]
android: Remove builtin_compiler

The first part was done in:

   commit c845140a20efa6a30a5465301d1f9b4acea79155
   Author: Kenneth Graunke <kenneth@whitecape.org>
   Date:   Tue Sep 3 21:22:17 2013 -0700

Signed-off-by: Adrian Negreanu <adrian.m.negreanu@intel.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
11 years agoutil/u_blit: Implement util_blit_pixels via pipe_context::blit.
José Fonseca [Thu, 12 Sep 2013 13:20:02 +0000 (14:20 +0100)]
util/u_blit: Implement util_blit_pixels via pipe_context::blit.

This removes a lot of code, but not everything, as util_blit_pixels_tex
is still useful when one needs to override pipe_sampler_view::swizzle_?.

Reviewed-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
11 years agoutil/u_blit: Support blits from cubemaps.
José Fonseca [Tue, 17 Sep 2013 18:22:44 +0000 (19:22 +0100)]
util/u_blit: Support blits from cubemaps.

By calling util_map_texcoords2d_onto_cubemap.

A new parameter for util_blit_pixels_tex is necessary, as
pipe_sampler_view::first_layer is always supposed to point to the first
face when sampling from cubemaps.

Reviewed-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
11 years agovega: Use pipe_context::blit instead of util_blit_pixels_tex.
José Fonseca [Tue, 17 Sep 2013 18:01:11 +0000 (19:01 +0100)]
vega: Use pipe_context::blit instead of util_blit_pixels_tex.

Only compile-tested but it seems straightforward.

Reviewed-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
11 years agoi965: Rename brw_{fs,vec4}_emit.cpp to brw_{fs,vec4}_generator.cpp.
Kenneth Graunke [Wed, 18 Sep 2013 06:32:10 +0000 (23:32 -0700)]
i965: Rename brw_{fs,vec4}_emit.cpp to brw_{fs,vec4}_generator.cpp.

The previous names were really confusing to talk about:
- brw_fs_visitor() contained methods named emit_whatever().
- brw_fs_generator() contained methods named generate_whatever(), but
  lived in brw_fs_emit.cpp.

So when someone said "the emit layer", or "emit code", we weren't sure
whether they meant the visitor's emit() functions or the generator in
brw_fs_emit.cpp.

By renaming these files, the method names, class names, and file names
all match, which is much less confusing.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Eric Anholt <eric@anholt.net>
11 years agoglsl: Correctly validate fma()'s types.
Matt Turner [Fri, 6 Sep 2013 22:05:10 +0000 (15:05 -0700)]
glsl: Correctly validate fma()'s types.

lrp() can take a scalar as a third argument, and fma() cannot.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoglsl: Add frexp signatures and implementation.
Matt Turner [Mon, 9 Sep 2013 18:13:20 +0000 (11:13 -0700)]
glsl: Add frexp signatures and implementation.

I initially implemented frexp() as an IR opcode with a lowering pass,
but since it returns a value and has an out-parameter, it would break
assumptions our optimization passes make about ir_expressions being pure
(i.e., having no side effects).

For example, if opt_tree_grafting encounters this code:

uniform float u;
void main()
{
  int exp;
  float f = frexp(u, out exp);
  float g = float(exp)/256.0;
  float h = float(exp) + 1.0;
  gl_FragColor = vec4(f, g, h, g + h);
}

it may try to optimize it to this:

uniform float u;
void main()
{
  int exp;
  float g = float(exp)/256.0;
  float h = float(exp) + 1.0;
  gl_FragColor = vec4(frexp(u, out exp), g, h, g + h);
}

Some hardware has an instruction which performs frexp(), but we would
need some other compiler infrastructure to be able to generate it, such
as an intrinsics system that would allow backends to emit specific code
for particular bits of IR.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
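
The ordering hazard in the frexp commit above can be reproduced with the C standard library's frexp(), which has the same shape: it returns the fraction and writes the exponent through an out-parameter. The sketch below is only an analogy in C, not GLSL IR; the function names are hypothetical, and the exponent is initialized to 0 so that the reordered version has a defined (but stale) value to read.

```c
#include <assert.h>
#include <math.h>

/* Intended order: call frexp() first, then read the exponent it wrote. */
static double exponent_read_after_call(double x, double *fraction)
{
   int e = 0;
   *fraction = frexp(x, &e);
   return (double)e / 256.0;
}

/* Order after the bad graft: the exponent is read before frexp() runs,
 * so the result is computed from the stale initial value. */
static double exponent_read_before_call(double x, double *fraction)
{
   int e = 0;
   double g = (double)e / 256.0;
   *fraction = frexp(x, &e);
   return g;
}
```

The two functions return the same fraction but different values for g, which is why a call with an out-parameter cannot be treated as a pure expression and moved freely.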
11 years agoi965: Lower ldexp.
Matt Turner [Sat, 3 Aug 2013 18:34:30 +0000 (11:34 -0700)]
i965: Lower ldexp.

v2: Drop frexp lowering.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
11 years agoglsl: Add ldexp_to_arith lowering pass.
Matt Turner [Sat, 3 Aug 2013 18:02:59 +0000 (11:02 -0700)]
glsl: Add ldexp_to_arith lowering pass.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
11 years agoglsl: Allow vectors to be created from ir_constant().
Matt Turner [Mon, 5 Aug 2013 22:15:37 +0000 (15:15 -0700)]
glsl: Allow vectors to be created from ir_constant().

Note the parameter name change in the int version of ir_constant, to
avoid the conflict with the loop iterator.

v2: Make analogous change to builtin_builder::imm().
Reviewed-by: Paul Berry <stereotype441@gmail.com>
11 years agoglsl: Add support for ldexp.
Matt Turner [Thu, 22 Aug 2013 20:31:18 +0000 (13:31 -0700)]
glsl: Add support for ldexp.

v2: Drop frexp. Rebase on builtins rewrite.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
11 years agoi965: Add some missing bits to {mesa,brw,cache}_bits[].
Paul Berry [Mon, 2 Sep 2013 00:52:20 +0000 (17:52 -0700)]
i965: Add some missing bits to {mesa,brw,cache}_bits[].

These data structures are used for debug output, so it wasn't hurting
anything that there were missing bits.  But it's good to keep things
up to date.

This patch also adds static asserts so that the {brw,cache}_bits[]
arrays are the proper size, so that we don't forget to add to them in
the future.  Unfortunately there's no convenient way to assert that
mesa_bits[] is the proper size.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
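
A static assert that ties a debug table to the flag list, as described above, might look like the following sketch. Everything here is hypothetical (the demo_ enum, struct, and table are invented for illustration and are not the driver's definitions); it assumes a C11 compiler and a table with one entry per flag plus a terminator.

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical flag list; the last enumerator counts the flags. */
enum demo_state_id {
   DEMO_STATE_URB,
   DEMO_STATE_VS,
   DEMO_STATE_FS,
   DEMO_NUM_STATE_BITS
};

struct demo_dirty_bit {
   const char *name;
   unsigned value;
};

/* Debug table: one entry per flag, followed by a terminator. */
static const struct demo_dirty_bit demo_bits[] = {
   { "DEMO_STATE_URB", 1u << DEMO_STATE_URB },
   { "DEMO_STATE_VS",  1u << DEMO_STATE_VS },
   { "DEMO_STATE_FS",  1u << DEMO_STATE_FS },
   { NULL, 0 }
};

/* Compilation fails if a flag is added to the enum but not to the table. */
static_assert(ARRAY_SIZE(demo_bits) == DEMO_NUM_STATE_BITS + 1,
              "demo_bits[] is out of sync with enum demo_state_id");
```

This only works when the flags come from an enum with a countable end marker, which may be why a table built from independent #defines has no equally convenient check.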
11 years agoi965/gs: Implement basic gl_PrimitiveIDIn functionality.
Paul Berry [Mon, 12 Aug 2013 15:00:10 +0000 (08:00 -0700)]
i965/gs: Implement basic gl_PrimitiveIDIn functionality.

If the geometry shader refers to the built-in variable
gl_PrimitiveIDIn, we need to set a bit in 3DSTATE_GS to tell the
hardware to dispatch primitive ID to r1, and we need to leave room for
it when allocating registers.

Note: this feature doesn't yet work properly when software primitive
restart is in use (the primitive ID counter will incorrectly reset
with each primitive restart, since software primitive restart works by
performing multiple draw calls).  I plan to address that in a future
patch series.

Fixes piglit test "spec/glsl-1.50/execution/geometry/primitive-id-in".

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoi965/gs: New gs primitive types are supported by HW primitive restart.
Paul Berry [Tue, 27 Aug 2013 04:20:12 +0000 (21:20 -0700)]
i965/gs: New gs primitive types are supported by HW primitive restart.

When we previously implemented primitive restart, we didn't add cases
to brw_primitive_restart.c's can_cut_index_handle_prims() for the
primitive types that are introduced with geometry shaders.  It turns
out that all of the new primitive types are supported by hardware
primitive restart.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoi965/gs: Add new primitive types.
Paul Berry [Sun, 28 Apr 2013 14:43:18 +0000 (07:43 -0700)]
i965/gs: Add new primitive types.

As part of its support for geometry shaders, GL 3.2 introduces four
new primitive types: GL_LINES_ADJACENCY, GL_LINE_STRIP_ADJACENCY,
GL_TRIANGLES_ADJACENCY, and GL_TRIANGLE_STRIP_ADJACENCY.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agogallivm: some bits of seamless cube filtering implementation
Roland Scheidegger [Fri, 13 Sep 2013 17:52:09 +0000 (19:52 +0200)]
gallivm: some bits of seamless cube filtering implementation

Simply adjust wrap mode to clamp_to_edge. This is all that's needed for a
correct implementation for nearest filtering, and it's way better than
using repeat wrap for instance for linear filtering (though obviously this
doesn't actually do seamless filtering).

v2: fix s/t wrap not r/s...

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
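
For nearest filtering, the difference between the two wrap modes mentioned above comes down to how an out-of-range texel index is mapped back into a face. The sketch below works on integer texel indices only and uses hypothetical function names; it is not gallivm code, which generates LLVM IR rather than running C.

```c
#include <assert.h>

/* Repeat wrap: the index wraps around modulo the texture size. */
static int wrap_repeat(int coord, int size)
{
   int m = coord % size;
   return m < 0 ? m + size : m;
}

/* Clamp-to-edge wrap: out-of-range indices are pinned to the nearest
 * edge texel of the same face. */
static int wrap_clamp_to_edge(int coord, int size)
{
   if (coord < 0)
      return 0;
   if (coord >= size)
      return size - 1;
   return coord;
}
```

With an 8-texel face, index 8 maps to texel 7 under clamp-to-edge but to texel 0 (the opposite edge) under repeat, which is the artifact the commit is avoiding.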
11 years agoi965: Remove MIPLAYOUT_BELOW from Gen4-6 constant buffer surface state.
Kenneth Graunke [Sat, 14 Sep 2013 03:01:08 +0000 (20:01 -0700)]
i965: Remove MIPLAYOUT_BELOW from Gen4-6 constant buffer surface state.

Specifying a miptree layout makes no sense for constant buffers.

This has no functional change since BRW_SURFACE_MIPMAPLAYOUT_BELOW is
just a #define for 0.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
11 years agoegl: Also add EGL_TEXTURE_FORMAT as a valid eglQueryWaylandBufferWL attribute
Kristian Høgsberg [Tue, 17 Sep 2013 05:22:49 +0000 (22:22 -0700)]
egl: Also add EGL_TEXTURE_FORMAT as a valid eglQueryWaylandBufferWL attribute

Now that we have a table of accepted eglQueryWaylandBufferWL() attributes,
we should also list EGL_TEXTURE_FORMAT.

11 years agoegl: add EGL_WAYLAND_Y_INVERTED_WL attribute
Stanislav Vorobiov [Mon, 16 Sep 2013 09:02:46 +0000 (13:02 +0400)]
egl: add EGL_WAYLAND_Y_INVERTED_WL attribute

This enables querying of wl_buffer's orientation.

11 years agoi965: Use gen7_upload_constant_state for 3DSTATE_CONSTANT_PS as well.
Kenneth Graunke [Fri, 13 Sep 2013 21:41:04 +0000 (14:41 -0700)]
i965: Use gen7_upload_constant_state for 3DSTATE_CONSTANT_PS as well.

Now we use gen7_upload_constant_state() for all three shader stages.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
11 years agoi965: Set brw_stage_state::push_const_size for PS constants.
Kenneth Graunke [Fri, 13 Sep 2013 21:37:09 +0000 (14:37 -0700)]
i965: Set brw_stage_state::push_const_size for PS constants.

This paves the way for using gen7_upload_constant_state for PS data.

The formula is copied from gen7_wm_state.c.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
11 years agoi965: Introduce a prog_data temporary in gen6_upload_wm_push_constants.
Kenneth Graunke [Fri, 13 Sep 2013 21:34:48 +0000 (14:34 -0700)]
i965: Introduce a prog_data temporary in gen6_upload_wm_push_constants.

This saves a bit of typing and shortens a few lines.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
11 years agoi965/gen6+: Support 128 varying components.
Paul Berry [Tue, 3 Sep 2013 19:37:47 +0000 (12:37 -0700)]
i965/gen6+: Support 128 varying components.

GL 3.2 requires us to support 128 varying components for geometry
shader outputs and fragment shader inputs, and 64 varying components
otherwise.  But there's no hardware limitation that restricts us to 64
varying components, and core Mesa doesn't currently allow different
stages to have different maximum values, so just go ahead and enable
128 varying components for all stages.  This gets us better test
coverage anyway.

Even though we are only working on GL 3.2 support for gen7 right now,
gen6 also supports 128 varying components, so go ahead and switch it
on there too.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoi965/ff_gs: Generate URB writes using a loop.
Paul Berry [Tue, 3 Sep 2013 21:38:19 +0000 (14:38 -0700)]
i965/ff_gs: Generate URB writes using a loop.

Previously we only ever did 1 URB write, since the maximum number of
varyings we support is small enough to fit in 1 URB write (when using
BRW_URB_SWIZZLE_NONE, which is what the pre-Gen7 GS always uses).  But
we're about to increase the number of varying components we support
from 64 to 128.

With 128 varyings, the most URB writes we'll have to do is 2, but it's
just as easy to write a general-purpose loop.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoi965/gen6: Fix assertions on VS/GS URB size.
Paul Berry [Tue, 3 Sep 2013 21:19:18 +0000 (14:19 -0700)]
i965/gen6: Fix assertions on VS/GS URB size.

The "{VS,GS} URB Entry Allocation Size" fields of 3DSTATE_URB allow
values in the range 0-4, but they are U8-1 fields, so the range of
possible allocation sizes is 1-5.  We were erroneously prohibiting a
size of 5.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
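
The off-by-one in the assertion follows from how a "minus one" field is encoded. A minimal sketch, assuming only what the commit text states (field values 0-4 representing sizes 1-5); the macro and function names are hypothetical and not taken from the driver:

```c
#include <assert.h>
#include <stdbool.h>

/* Largest raw value the field accepts, per the commit text above. */
#define DEMO_URB_FIELD_MAX 4u

/* A size is valid if (size - 1) fits in the field, so 1..5 inclusive. */
static bool demo_urb_size_is_valid(unsigned size)
{
   return size >= 1 && size <= DEMO_URB_FIELD_MAX + 1;
}

/* U8-1 style encoding: the hardware field stores size minus one. */
static unsigned demo_encode_urb_size(unsigned size)
{
   assert(demo_urb_size_is_valid(size));
   return size - 1;
}

static unsigned demo_decode_urb_size(unsigned field)
{
   return field + 1;
}
```

Checking the size against the raw field maximum (4) instead of the decoded maximum (5) is the kind of mistake that rejects a legal size of 5.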
11 years agoi965/vec4: Generate URB writes using a loop.
Paul Berry [Tue, 3 Sep 2013 19:30:06 +0000 (12:30 -0700)]
i965/vec4: Generate URB writes using a loop.

Previously we only ever did 1 or 2 URB writes, since the maximum
number of varyings we support is small enough to fit in 2 URB writes.
But GL 3.2 requires the geometry shader to support 128 output varying
components, and this could require up to 3 URB writes.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
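
A general-purpose loop of the kind described above can be sketched as plain chunking. This is an illustration only: the helper name is hypothetical, the per-message capacity is passed in as a parameter because the real limit depends on the hardware message length, and a real backend would emit a URB write where this code records an offset.

```c
#include <assert.h>

/* Hypothetical helper, not driver code: splits num_slots varying slots into
 * messages of at most max_per_write slots and records where each one starts.
 * Returns the number of messages generated. */
static int demo_plan_urb_writes(int num_slots, int max_per_write,
                                int *offsets, int max_writes)
{
   int writes = 0;

   assert(max_per_write > 0);
   for (int slot = 0; slot < num_slots; slot += max_per_write) {
      assert(writes < max_writes);
      offsets[writes++] = slot;   /* a real backend would emit a message here */
   }
   return writes;
}
```

The loop handles one, two, or three messages with the same code path, so raising the varying limit later needs no further special cases.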
11 years agoi965/fs: When >64 input components, order them to match prev pipeline stage.
Paul Berry [Tue, 3 Sep 2013 19:15:53 +0000 (12:15 -0700)]
i965/fs: When >64 input components, order them to match prev pipeline stage.

Since the SF/SBE stage is only capable of performing arbitrary
reorderings of 16 varying slots, we can't arrange the fragment shader
inputs in an arbitrary order if there are more than 16 input varying
slots in use.  We need to make sure that slots 16-31 match the
corresponding outputs of the previous pipeline stage.

The easiest way to accomplish this is to just make all varying slots
match up with the previous pipeline stage.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoi965/fs: Simplify computation of key.input_slots_valid during precompile.
Paul Berry [Tue, 3 Sep 2013 18:55:17 +0000 (11:55 -0700)]
i965/fs: Simplify computation of key.input_slots_valid during precompile.

The for loop was rather silly.  In addition to checking brw->gen < 6
on each loop iteration, it took pains to exclude bits from
fp->Base.InputsRead that don't correspond to fragment shader inputs.
But those bits would never have been set in the first place, since the
only bits that are ever set in fp->Base.InputsRead are fragment shader
inputs.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoi965/gs: Stop storing an input VUE map in the GS program key.
Paul Berry [Mon, 2 Sep 2013 21:02:22 +0000 (14:02 -0700)]
i965/gs: Stop storing an input VUE map in the GS program key.

Now that the vertex shader output VUE map is determined solely by a
64-bit bitfield, we don't have to store it in its entirety in the
geometry shader program key; instead, we can just store the bitfield,
and let the geometry shader infer the VUE map at compile time.

This dramatically reduces the size of the geometry shader program key,
which we want to keep small since it gets recomputed whenever the
active program changes.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
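
The idea of storing only the bitfield and inferring the map later can be sketched as follows. The struct and function are hypothetical and deliberately simplified (a real VUE map also has header slots and special cases); the sketch assumes that each set bit is assigned the next free slot in ascending bit order.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MAX_VARYINGS 64

/* Hypothetical, simplified map from varying index to slot number. */
struct demo_vue_map {
   int varying_to_slot[DEMO_MAX_VARYINGS];   /* -1 if the varying is absent */
   int num_slots;
};

/* Rebuild the whole map from the 64-bit "slots valid" bitfield. */
static void demo_compute_vue_map(struct demo_vue_map *map, uint64_t slots_valid)
{
   map->num_slots = 0;
   for (int i = 0; i < DEMO_MAX_VARYINGS; i++) {
      if (slots_valid & (UINT64_C(1) << i))
         map->varying_to_slot[i] = map->num_slots++;
      else
         map->varying_to_slot[i] = -1;
   }
}
```

Because the map is a pure function of the bitfield, a program key only needs the 8-byte bitfield, and the larger structure can be recomputed at compile time.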