mesa.git
11 years agoglsl/linker: Sort varyings by packing class, then vector size.
Paul Berry [Wed, 5 Dec 2012 18:19:19 +0000 (10:19 -0800)]
glsl/linker: Sort varyings by packing class, then vector size.

This patch paves the way for varying packing by adding a sorting step
before varying assignment, which sorts the varyings into an order that
increases the likelihood of being able to find an efficient packing.

First, varyings are sorted into "packing classes" by considering
attributes that can't be mixed during varying packing--at the moment
this includes base type (float/int/uint/bool) and interpolation mode
(smooth/noperspective/flat/centroid), though later we will hopefully
be able to relax some of these restrictions.  The number of packing
classes places an upper limit on the amount of space that must be
wasted by varying packing, since in theory a shader might have 4n+1
components worth of varyings in each of m packing classes, resulting
in 3m components worth of wasted space.
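
To make the arithmetic concrete, here is a small illustrative sketch. The
function names are hypothetical and not part of Mesa; the only assumption
is that a varying slot holds 4 components.

```cpp
// Hypothetical helpers, for illustration only.
// A varying slot holds 4 components, so a packing class that uses
// `used` components wastes whatever is needed to fill its last slot.
int wasted_components(int used)
{
    return (4 - used % 4) % 4;
}

// Worst case from the text: m packing classes, each holding 4n+1
// components, so each class wastes 3 components.
int worst_case_waste(int m, int n)
{
    return m * wasted_components(4 * n + 1);
}
```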

Then, within each packing class, varyings are sorted by vector size,
with vec4's coming first, then vec2's, then scalars, and then finally
vec3's.  The motivation for this order is that it ensures that the
only vectors that might be "double parked" (with part of the vector in
one varying slot and the remainder in another) are vec3's.
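
The ordering described above could be sketched roughly as follows. This is
not the linker's actual code: the Varying struct, the integer packing_class
field and the function names are assumptions made for the example.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical record for one varying; not Mesa's actual data structure.
struct Varying {
    int packing_class;   // varyings in different classes can't share a slot
    int components;      // 1 = scalar, 2 = vec2, 3 = vec3, 4 = vec4
};

// Rank vector sizes in the order described: vec4, vec2, scalar, vec3.
static int size_rank(int components)
{
    switch (components) {
    case 4:  return 0;
    case 2:  return 1;
    case 1:  return 2;
    default: return 3;   // vec3's go last
    }
}

// Sort by packing class first, then by size rank within each class.
void sort_varyings(std::vector<Varying> &v)
{
    std::stable_sort(v.begin(), v.end(),
                     [](const Varying &a, const Varying &b) {
                         if (a.packing_class != b.packing_class)
                             return a.packing_class < b.packing_class;
                         return size_rank(a.components) <
                                size_rank(b.components);
                     });
}
```

With vec4's, vec2's and scalars placed first, every slot boundary falls on
a vector boundary until the vec3's begin, which is why only vec3's can
straddle two slots.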

Note that the varyings aren't actually packed yet, merely placed in an
order that will facilitate packing.

Reviewed-by: Eric Anholt <eric@anholt.net>
11 years agoglsl/linker: Subdivide the first phase of varying assignment.
Paul Berry [Tue, 4 Dec 2012 23:55:59 +0000 (15:55 -0800)]
glsl/linker: Subdivide the first phase of varying assignment.

This patch further subdivides the loop that assigns varying locations
into two phases: one phase to match up the varyings between shader
stages, and one phase to assign them varying locations.

In between the two phases the matched varyings are stored in a new
data structure called varying_matches.  This will free us to be able
to assign varying locations in any order, which will pave the way for
packing varyings.

Note that the new varying_matches::assign_locations() function returns
the number of varying slots that were used; this return value will be
used in a future patch.
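
A much-simplified sketch of the two-phase idea follows. Only the names
varying_matches and assign_locations() come from this patch; the class
layout, the record() method and the one-slot-per-varying policy are
assumptions made for illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for the varying_matches idea.
struct Match {
    std::string name;   // varying matched between two shader stages
    int location;       // -1 until assign_locations() runs
};

class VaryingMatches {
public:
    // Phase 1: remember that a varying was matched up between stages.
    void record(const std::string &name)
    {
        matches.push_back(Match{name, -1});
    }

    // Phase 2: hand out locations and return the number of slots used.
    // This sketch gives every varying a whole slot (no packing).
    unsigned assign_locations()
    {
        unsigned next = 0;
        for (Match &m : matches)
            m.location = static_cast<int>(next++);
        return next;
    }

    std::vector<Match> matches;
};
```

Because matching and assignment are decoupled, the second phase is free to
reorder the stored matches before handing out locations.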

Reviewed-by: Eric Anholt <eric@anholt.net>
11 years agoglsl/linker: Defer recording transform feedback locations.
Paul Berry [Tue, 4 Dec 2012 18:34:45 +0000 (10:34 -0800)]
glsl/linker: Defer recording transform feedback locations.

This patch subdivides the loop that assigns varying locations into two
phases: one phase to match up varyings between shader stages (and
assign them varying locations), and a second phase to record the
varying assignments for use by transform feedback.

This paves the way for varying packing, which will require us to
further subdivide the first phase.

In addition, it lets us avoid a clumsy O(n^2) algorithm, since we can
now record the locations of all transform feedback varyings in a
single pass through the tfeedback_decls array, rather than have to
iterate through the array after assigning each varying.
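
The single-pass recording could look something like the sketch below. The
struct, the function name and the use of a std::map for lookup are
assumptions for the example, not the linker's real data structures.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for one transform feedback declaration.
struct TfeedbackDecl {
    std::string name;
    int location;   // -1 while unresolved
};

// Runs once, after every varying has been assigned a location.
// Returns the number of declarations that were resolved.
unsigned record_tfeedback_locations(
    std::vector<TfeedbackDecl> &decls,
    const std::map<std::string, int> &assigned)
{
    unsigned resolved = 0;
    for (TfeedbackDecl &decl : decls) {
        std::map<std::string, int>::const_iterator it =
            assigned.find(decl.name);
        if (it != assigned.end()) {
            decl.location = it->second;
            ++resolved;
        }
    }
    return resolved;
}
```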

Reviewed-by: Eric Anholt <eric@anholt.net>
11 years agoglsl: Create a field to store fractional varying locations.
Paul Berry [Wed, 5 Dec 2012 18:47:55 +0000 (10:47 -0800)]
glsl: Create a field to store fractional varying locations.

Currently, the location of each varying is recorded in ir_variable as
a multiple of the size of a vec4.  In order to pack varyings, we need
to be able to record, e.g. that a vec2 is stored in the second half of
a varying slot rather than the first half.

This patch introduces a field ir_variable::location_frac, which
represents the offset within a vec4 where a varying's value is stored.
Varyings that are not subject to packing will always have a
location_frac value of zero.
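
As an illustration of how the two fields combine, the sketch below
computes a flat component offset. The struct and function are
hypothetical; only the location and location_frac field names come from
this patch.

```cpp
#include <cassert>

// Hypothetical mirror of the two ir_variable fields discussed above.
struct VaryingLocation {
    int location;        // which vec4-sized slot
    int location_frac;   // component offset within that slot, 0..3
};

// Flat component index of the varying's first component.
int component_offset(const VaryingLocation &v)
{
    assert(v.location_frac >= 0 && v.location_frac < 4);
    return v.location * 4 + v.location_frac;
}
```

For example, a vec2 stored in the second half of slot 3 would have
location 3 and location_frac 2.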

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
11 years agoglsl/linker: Make separate ir_variable field to mean "unmatched".
Paul Berry [Tue, 4 Dec 2012 23:17:01 +0000 (15:17 -0800)]
glsl/linker: Make separate ir_variable field to mean "unmatched".

Previously, the linker used a value of -1 in ir_variable::location to
denote a generic input or output of the shader that had not yet been
matched up to a variable in another pipeline stage.

This patch introduces a new ir_variable field,
is_unmatched_generic_inout, for that purpose.

In future patches, this will allow us to separate the process of
matching varyings between shader stages from the processes of
assigning locations to those varyings.  That will in turn pave the way
for packing varyings.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
11 years agoglsl/linker: Always invalidate shader ins/outs, even in corner cases.
Paul Berry [Wed, 5 Dec 2012 15:17:07 +0000 (07:17 -0800)]
glsl/linker: Always invalidate shader ins/outs, even in corner cases.

Previously, link_invalidate_variable_locations() was only called
during assign_attribute_or_color_locations() and
assign_varying_locations().  This meant that in the corner case when
there was only a vertex shader, and varyings were being captured by
transform feedback, link_invalidate_variable_locations() wasn't being
called for the varyings.

This patch migrates the calls to link_invalidate_variable_locations()
to link_shaders(), so that they will be called in all circumstances.
In addition, it modifies the call semantics so that
link_invalidate_variable_locations() need only be called once per
shader stage (rather than once for inputs and once for outputs).

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
11 years agoglsl/lower_clip_distance: Update symbol table.
Paul Berry [Tue, 4 Dec 2012 19:11:02 +0000 (11:11 -0800)]
glsl/lower_clip_distance: Update symbol table.

This patch modifies the clip distance lowering pass so that the new
symbol it generates (gl_ClipDistanceMESA) is added to the shader's
symbol table.

This will allow a later patch to modify the linker so that it finds
transform feedback varyings using the symbol table rather than having
to iterate through all the declarations in the shader.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
11 years agoandroid: build fix for libmesa_glsl_utils
Tapani Pälli [Thu, 13 Dec 2012 08:56:08 +0000 (10:56 +0200)]
android: build fix for libmesa_glsl_utils

hash_table.c compilation requires ralloc.h include path

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
11 years agomesa: minor indentation fixes in texcompress_etc.c
Brian Paul [Sat, 8 Dec 2012 22:19:44 +0000 (15:19 -0700)]
mesa: minor indentation fixes in texcompress_etc.c

11 years agomesa: remove old swrast-based compressed texel fetch code
Brian Paul [Sat, 8 Dec 2012 22:19:44 +0000 (15:19 -0700)]
mesa: remove old swrast-based compressed texel fetch code

11 years agoswrast: use new core Mesa compressed texel fetch functions
Brian Paul [Sat, 8 Dec 2012 22:19:44 +0000 (15:19 -0700)]
swrast: use new core Mesa compressed texel fetch functions

11 years agomesa: reimplement _mesa_decompress_image() using new tex fetch code
Brian Paul [Sat, 8 Dec 2012 22:19:44 +0000 (15:19 -0700)]
mesa: reimplement _mesa_decompress_image() using new tex fetch code

11 years agomesa: added _mesa_get_compressed_fetch_func()
Brian Paul [Sat, 8 Dec 2012 22:19:44 +0000 (15:19 -0700)]
mesa: added _mesa_get_compressed_fetch_func()

11 years agomesa: add new texel fetch code for etc formats
Brian Paul [Sat, 8 Dec 2012 22:19:44 +0000 (15:19 -0700)]
mesa: add new texel fetch code for etc formats

11 years agomesa: add new texel fetch code for rgtc formats
Brian Paul [Sat, 8 Dec 2012 22:19:44 +0000 (15:19 -0700)]
mesa: add new texel fetch code for rgtc formats

11 years agomesa: add new texel fetch code for fxt formats
Brian Paul [Sat, 8 Dec 2012 22:19:44 +0000 (15:19 -0700)]
mesa: add new texel fetch code for fxt formats

11 years agomesa: add new texel fetch code for dxt formats
Brian Paul [Sat, 8 Dec 2012 22:19:44 +0000 (15:19 -0700)]
mesa: add new texel fetch code for dxt formats

11 years agomesa: add compressed_fetch_func typedef
Brian Paul [Sat, 8 Dec 2012 22:19:44 +0000 (15:19 -0700)]
mesa: add compressed_fetch_func typedef

This is a first step in removing the swrast-related code in core
Mesa's texture compression files.

11 years agoswrast: merge get_texel_fetch_func() and set_fetch_functions()
Brian Paul [Sat, 8 Dec 2012 22:19:44 +0000 (15:19 -0700)]
swrast: merge get_texel_fetch_func() and set_fetch_functions()

No real need for separate functions anymore.

11 years agoswrast: make _mesa_get_texel_fetch_func() static
Brian Paul [Sat, 8 Dec 2012 22:19:44 +0000 (15:19 -0700)]
swrast: make _mesa_get_texel_fetch_func() static

Not called from any other file.

11 years agodraw/llvmpipe: fix transform feedback position + enable other extensions
Dave Airlie [Thu, 13 Dec 2012 10:17:58 +0000 (20:17 +1000)]
draw/llvmpipe: fix transform feedback position + enable other extensions

This builds on the previous draw/softpipe patch.

llvmpipe does streamout calls after the clip/viewport stages,
but we have the pre-clip position stored for later use, so
when we are doing transform feedback and the output is the position,
grab it from the stored pre-clip position.

The perfect fix is probably to add a codegen transform feedback
stage in between the shader and clip stages, but this is good enough
for now.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
11 years agodraw: add support for later transform feedback extensions
Dave Airlie [Wed, 12 Dec 2012 11:14:58 +0000 (21:14 +1000)]
draw: add support for later transform feedback extensions

This adds support to draw for the new features of transform feedback.

a) fix count_from_stream_output, using max_index+1 for now but it looks
like it should be valid as it's derived from the vertex elements/vbo.

b) fix striding and dst offsets in output buffers - was just wrong before.

c) fix crash if tfb is suspended (so.num_targets == 0)

This also enables the new features on softpipe. It should be possible
to enable them on llvmpipe as well after this commit, but would need
to schedule piglit runs.

Signed-off-by: Dave Airlie <airlied@redhat.com>
11 years agoclover: Fix build since removal of pipe_surface::usage
Tom Stellard [Thu, 13 Dec 2012 20:04:34 +0000 (20:04 +0000)]
clover: Fix build since removal of pipe_surface::usage

by commit 25409c6da8163d9acb386511aef0c11577c7aadb

11 years agor600g/radeonsi: Silence warnings
Maxence Le Dore [Thu, 13 Dec 2012 04:17:35 +0000 (05:17 +0100)]
r600g/radeonsi: Silence warnings

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
11 years agoclover: Add support for compiler flags
Tom Stellard [Tue, 27 Nov 2012 21:57:15 +0000 (21:57 +0000)]
clover: Add support for compiler flags

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
11 years agoclover: Don't erase build info of devices not being built
Tom Stellard [Mon, 10 Dec 2012 16:04:04 +0000 (16:04 +0000)]
clover: Don't erase build info of devices not being built

Every call to _cl_program::build() was erasing the binaries and logs for
every device associated with the program.  This is incorrect because
it is possible to build a program for only a subset of devices and so
any device not being built should not have this information erased.
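
The intended behaviour can be sketched as below. This is not clover's
code: BuildInfo, the integer device ids and the free function build() are
stand-ins invented for the example.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical per-device build results.
struct BuildInfo {
    std::string binary;
    std::string log;
};

// Rebuild only for the listed devices; results for any other device
// already present in `info` are left untouched.
void build(std::map<int, BuildInfo> &info, const std::vector<int> &devices)
{
    for (int dev : devices) {
        info.erase(dev);   // drop stale results for this device only
        info[dev] = BuildInfo{"binary for device " + std::to_string(dev), ""};
    }
}
```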

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
11 years agor600g: use load_ar checks with llvm output.
Vincent Lejeune [Tue, 6 Nov 2012 15:18:06 +0000 (16:18 +0100)]
r600g: use load_ar checks with llvm output.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
11 years agobuild: Fix AX_PROG_{CC,CXX}_FOR_BUILD macros
Thierry Reding [Tue, 20 Nov 2012 15:50:35 +0000 (16:50 +0100)]
build: Fix AX_PROG_{CC,CXX}_FOR_BUILD macros

Override the cross_compiling and ac_tool_prefix variables by reassigning
to them instead of redefining the macros. Redefining them will actually
cause the variable names to be replaced instead of their content.

Furthermore push the definition of CPPFLAGS before running the checks
for the build tools to avoid the host CPPFLAGS from leaking into the
build CPPFLAGS.

While at it drop the redefinition of AC_TRY_COMPILER which hasn't been
used since autoconf 2.50 and make sure that all definitions are properly
popped when done (LDFLAGS, ac_cv_prog_CPP, ac_cv_prog_CXXCPP).

Acked-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
11 years agogallivm: fix texel fetch for array textures
Roland Scheidegger [Thu, 13 Dec 2012 18:12:58 +0000 (19:12 +0100)]
gallivm: fix texel fetch for array textures

Since we don't call lp_build_sample_common() in the texel fetch path, we missed
the layer fixup code. If someone had tried to do texelFetch with array
textures, it would have crashed for sure.
Not really tested (the piglit test that uses texelFetch with
array samplers can't be run on llvmpipe for now).

Reviewed-by: José Fonseca <jfonseca@vmware.com>
11 years agomesa: Fix computation of default vertex attrib stride for 2_10_10_10 formats.
Paul Berry [Wed, 12 Dec 2012 03:14:33 +0000 (19:14 -0800)]
mesa: Fix computation of default vertex attrib stride for 2_10_10_10 formats.

Previously, if the client program didn't specify a stride when setting
up a vertex attribute, we used _mesa_sizeof_type() to compute the size
of the type, and multiplied it by the number of components.

This didn't work for the 2_10_10_10 formats, since _mesa_sizeof_type()
returns -1 for those types, resulting in all kinds of havoc, since it
was causing the hardware to be programmed with a negative stride
value.

This patch adds a new function _mesa_bytes_per_vertex_attrib(), which
is similar to the existing function _mesa_bytes_per_pixel(), but which
computes the size of a vertex attribute based on the type and the
number of components.  For packed formats (currently only the 2_10_10_10
formats), it verifies that the number of components is correct and
returns the size of the packed format.  For unpacked formats, it
returns the size of the type times the number of components.
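
A rough sketch of that logic is shown below. The enum and function here
are hypothetical simplifications (the real function takes GL enums and
covers more types); the sketch assumes a 2_10_10_10 value occupies 4 bytes
and requires exactly 4 components.

```cpp
// Hypothetical subset of attribute types, for illustration only.
enum AttribType {
    TYPE_UBYTE,
    TYPE_SHORT,
    TYPE_FLOAT,
    TYPE_INT_2_10_10_10_REV,
    TYPE_UINT_2_10_10_10_REV
};

// Returns the attribute size in bytes, or -1 for an invalid combination.
int bytes_per_vertex_attrib(int comps, AttribType type)
{
    switch (type) {
    case TYPE_UBYTE: return comps * 1;   // unpacked: type size * components
    case TYPE_SHORT: return comps * 2;
    case TYPE_FLOAT: return comps * 4;
    case TYPE_INT_2_10_10_10_REV:
    case TYPE_UINT_2_10_10_10_REV:
        // Packed: all four components share one 32-bit word.
        return comps == 4 ? 4 : -1;
    }
    return -1;   // unknown type
}
```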

In addition, this patch adds an assertion so that if we ever forget to
update _mesa_bytes_per_vertex_attrib() when adding a new vertex
format, we'll see the problem quickly rather than having to debug a
subtle conformance test failure.

Fixes GLES3 conformance tests
vertex_type_2_10_10_10_rev_{conversion,divisor,stride_pointer}.test.

Reviewed-by: Brian Paul <brianp@vmware.com>
11 years agomesa/uniform_query: Don't write to *params if there is an error
Matt Turner [Sat, 8 Dec 2012 00:32:30 +0000 (16:32 -0800)]
mesa/uniform_query: Don't write to *params if there is an error

The GL 3.1 and ES 3.0 specs say of glGetActiveUniformsiv:
   "If an error occurs, nothing will be written to params."

So, make a pass through the indices and check that they're valid before
the pass that actually writes to params. Checking pname happens on the
first iteration of the second loop.
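
The validate-then-write pattern can be sketched as follows. The function
and its parameters are invented for the example and do not match the real
uniform query code.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical query: copies values[indices[i]] into params[i].
// Returns false, leaving params untouched, if any index is invalid.
bool get_uniform_values(const std::vector<int> &values,
                        const unsigned *indices, std::size_t count,
                        int *params)
{
    // Pass 1: validate every index before writing anything.
    for (std::size_t i = 0; i < count; i++) {
        if (indices[i] >= values.size())
            return false;
    }

    // Pass 2: all indices are known to be valid, so write the results.
    for (std::size_t i = 0; i < count; i++)
        params[i] = values[indices[i]];

    return true;
}
```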

Fixes es3conform's getactiveuniformsiv_for_nonexistent_uniform_indices
test.

NOTE: This is a candidate for the 9.0 branch.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agomesa: print unsigned values with %u
Matt Turner [Fri, 7 Dec 2012 22:26:04 +0000 (14:26 -0800)]
mesa: print unsigned values with %u

Otherwise messages say silly things like
   glGetActiveUniformBlockiv(block index -1 >= 0)
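
A minimal illustration of the fix, using a made-up helper rather than
Mesa's error reporting functions, and assuming a 32-bit unsigned int:

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper that formats an out-of-range block index.
// %u prints the unsigned value as-is; an index of 0xFFFFFFFF shows up
// as 4294967295 instead of the misleading -1.
std::string describe_bad_block_index(unsigned index, unsigned limit)
{
    char buf[64];
    std::snprintf(buf, sizeof(buf), "block index %u >= %u", index, limit);
    return std::string(buf);
}
```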

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoi965: Fix disassembly of jump targets on Gen7.
Kenneth Graunke [Wed, 12 Dec 2012 10:37:58 +0000 (02:37 -0800)]
i965: Fix disassembly of jump targets on Gen7.

Gen7 stores the JIP/UIP bits in different places.

Reviewed-by: Eric Anholt <eric@anholt.net>
11 years agoi965: Make try_rewrite_rhs_to_dst compare VGRF size to regs written.
Kenneth Graunke [Sun, 9 Dec 2012 23:44:03 +0000 (15:44 -0800)]
i965: Make try_rewrite_rhs_to_dst compare VGRF size to regs written.

try_rewrite_rhs_to_dst is a quick optimization to avoid generating new
temporaries (and MOVs from those temporaries to the dest) for every
expression tree we visit.  By generating better code in simple cases, we
reduce the burden on later optimization passes like register coalescing.

Previously, we compared inst->regs_written() to lhs->vector_elements
to make sure the instruction generating our value wrote the same number
of components as our destination register.

However, this fails in some cases.  One example is texturing (which
produces a vec4) into gl_FragData[i].  Technically, gl_FragData[i] is
also a vec4.  However, the destination VGRF actually has size 4n (where
n is the size of the array).

split_virtual_grfs() can't split VGRFs that are used by SEND messages
which require contiguous destination registers (like texturing), and
register allocation needs all VGRFs to have sizes between 1 and 4.

Amnesia: The Dark Descent hits this case: a texturing instruction
(4 components) gets rewritten to the gl_FragData output register
(which was 4*3 = 12 components), causing the register allocator to
hit the "we rely on split_virtual_grfs" assertion.
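
The difference between the two checks can be illustrated with plain
integers. The function names below are hypothetical; the real code
inspects IR and instruction objects.

```cpp
// Old check: compare registers written to the type's vector width.
bool old_sizes_match(int regs_written, int lhs_vector_elements)
{
    return regs_written == lhs_vector_elements;
}

// New check: compare registers written to the size of the whole
// destination VGRF, which for an array covers every element.
bool new_sizes_match(int regs_written, int dst_vgrf_size)
{
    return regs_written == dst_vgrf_size;
}
```

For the Amnesia case, a 4-component texture result written to a
gl_FragData element passes the old check (4 == 4) but fails the new one
(4 != 12), so the rewrite is skipped.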

This makes it possible to play Amnesia.

Reviewed-by: Eric Anholt <eric@anholt.net>
11 years agoconfigure.ac: Disable compiler optimizations when --enable-debug is set
Emil Velikov [Wed, 12 Dec 2012 14:59:36 +0000 (14:59 +0000)]
configure.ac: Disable compiler optimizations when --enable-debug is set

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Dan Nicholson <dbn.lists@gmail.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
11 years agosoftpipe: remove unused corner0 variable
Brian Paul [Wed, 12 Dec 2012 15:51:19 +0000 (08:51 -0700)]
softpipe: remove unused corner0 variable

11 years agollvmpipe: remove unneeded draw_flush() call
Brian Paul [Mon, 10 Dec 2012 17:32:07 +0000 (10:32 -0700)]
llvmpipe: remove unneeded draw_flush() call

This is redundant since we're calling draw_bind_fragment_shader()
which already does a flush.

v2: the redundant flush in llvmpipe_set_constant_buffer() has
already been removed by commit 3427466e6dbbb8db7c1ecda6b3859ca1cc5827a3

Reviewed-by: José Fonseca <jfonseca@vmware.com>
11 years agor600g: suballocate memory for fetch shaders from a large buffer
Marek Olšák [Sun, 9 Dec 2012 17:51:31 +0000 (18:51 +0100)]
r600g: suballocate memory for fetch shaders from a large buffer

Fetch shaders are usually destroyed at the context destruction by the state
tracker, so we can put them all in a large buffer without wasting memory.

This reduces the number of relocations sent to the kernel a little bit.

Tested-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
11 years agor600g: suballocate memory for the STRMOUT_BUFFER_FILLED_SIZE register
Marek Olšák [Sun, 9 Dec 2012 16:56:26 +0000 (17:56 +0100)]
r600g: suballocate memory for the STRMOUT_BUFFER_FILLED_SIZE register

Instead of having a 4-byte buffer for each streamout target, we suballocate
each dword from a 4K buffer.
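
The general idea of suballocation can be sketched with a simple bump
allocator. This is an assumption-laden illustration and not the gallium
allocator's API: the struct, its fields and suballoc() are invented here.

```cpp
#include <cassert>

// Hypothetical bump suballocator over fixed-size backing buffers.
struct Suballocator {
    unsigned buffer_size;       // size of each backing buffer in bytes
    unsigned offset;            // next free byte in the current buffer
    unsigned buffers_created;   // how many backing buffers were needed
};

// Returns the offset of a `size`-byte region in the current buffer,
// starting a new backing buffer when the current one is full.
unsigned suballoc(Suballocator &s, unsigned size)
{
    assert(size <= s.buffer_size);
    if (s.buffers_created == 0 || s.offset + size > s.buffer_size) {
        s.buffers_created++;
        s.offset = 0;
    }
    unsigned result = s.offset;
    s.offset += size;
    return result;
}
```

With a 4096-byte buffer and 4-byte allocations, 1024 streamout targets
share a single buffer, so the kernel sees one relocation target instead of
1024.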

This further reduces the overall number of relocations.

Tested-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
11 years agogallium/util: add a simple allocator for suballocating from a large buffer
Marek Olšák [Sun, 9 Dec 2012 16:33:28 +0000 (17:33 +0100)]
gallium/util: add a simple allocator for suballocating from a large buffer

Tested-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
11 years agor600g: use u_upload_mgr for allocating staging transfer buffers
Marek Olšák [Sun, 9 Dec 2012 15:43:16 +0000 (16:43 +0100)]
r600g: use u_upload_mgr for allocating staging transfer buffers

u_upload_mgr suballocates memory from a large buffer and maps the allocated
range (unsynchronized), which is perfect for short-lived staging buffers.

This reduces the number of relocations sent to the kernel.

Tested-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
11 years agowinsys/radeon: don't use BIND flags, add a flag for the cache bufmgr instead
Marek Olšák [Sat, 8 Dec 2012 23:02:46 +0000 (00:02 +0100)]
winsys/radeon: don't use BIND flags, add a flag for the cache bufmgr instead

11 years agost/dri: add a way to force MSAA on with an environment variable
Marek Olšák [Mon, 10 Dec 2012 20:35:59 +0000 (21:35 +0100)]
st/dri: add a way to force MSAA on with an environment variable

There are 2 ways. I prefer the former:
  GALLIUM_MSAA=n
  __GL_FSAA_MODE=n
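
Reading such a setting might look like the sketch below. Only the two
variable names come from this commit; the helper functions, the
precedence (GALLIUM_MSAA checked first) and the treatment of invalid
values as "not forced" are assumptions.

```cpp
#include <cstdlib>

// Hypothetical parser: returns the forced sample count, or 0 if MSAA
// is not being forced. Either argument may be null (variable unset).
int forced_msaa_samples(const char *gallium_msaa, const char *gl_fsaa_mode)
{
    // Assumption: GALLIUM_MSAA wins when both variables are set.
    const char *value = gallium_msaa ? gallium_msaa : gl_fsaa_mode;
    if (!value)
        return 0;
    int n = std::atoi(value);
    return n > 0 ? n : 0;
}

// Convenience wrapper that reads the process environment.
int forced_msaa_samples_from_env()
{
    return forced_msaa_samples(std::getenv("GALLIUM_MSAA"),
                               std::getenv("__GL_FSAA_MODE"));
}
```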

Tested with ETQW, which doesn't support MSAA on Linux. This is
the only way to get MSAA there.

Reviewed-by: Brian Paul <brianp@vmware.com>
11 years agomesa: don't advertise ARB_texture_buffer_object in legacy contexts
Marek Olšák [Sat, 8 Dec 2012 21:53:23 +0000 (22:53 +0100)]
mesa: don't advertise ARB_texture_buffer_object in legacy contexts

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
11 years agomesa: disallow creation of GL 3.1 compatibility contexts
Marek Olšák [Sat, 8 Dec 2012 21:48:47 +0000 (22:48 +0100)]
mesa: disallow creation of GL 3.1 compatibility contexts

Death to driver-specific hacks!

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
11 years agogallium: remove pipe_surface::usage
Marek Olšák [Sat, 8 Dec 2012 14:37:17 +0000 (15:37 +0100)]
gallium: remove pipe_surface::usage

Not really used by anybody now.

Reviewed-by: Brian Paul <brianp@vmware.com>
11 years agosvga: stop using pipe_surface::usage
Marek Olšák [Sat, 8 Dec 2012 13:53:55 +0000 (14:53 +0100)]
svga: stop using pipe_surface::usage

There are only 2 possible usages: render target and depth stencil.
Both can be derived from the surface format, so the flag is redundant.

And it's going away...

Reviewed-by: Brian Paul <brianp@vmware.com>
11 years agogallium/util: move util_try_blit_via_copy_region to u_surface.c
Marek Olšák [Fri, 7 Dec 2012 19:07:48 +0000 (20:07 +0100)]
gallium/util: move util_try_blit_via_copy_region to u_surface.c

Reviewed-by: Brian Paul <brianp@vmware.com>
11 years agogallium/cso: don't use the pipe_error return type where it's not needed
Marek Olšák [Fri, 7 Dec 2012 18:59:03 +0000 (19:59 +0100)]
gallium/cso: don't use the pipe_error return type where it's not needed

Reviewed-by: Brian Paul <brianp@vmware.com>
11 years agogallium: manage render condition in cso_context and fix postprocessing w/ it
Marek Olšák [Fri, 7 Dec 2012 18:52:00 +0000 (19:52 +0100)]
gallium: manage render condition in cso_context and fix postprocessing w/ it

Reviewed-by: Brian Paul <brianp@vmware.com>
11 years agost/mesa: remove a weird msaa hack
Marek Olšák [Fri, 7 Dec 2012 18:31:33 +0000 (19:31 +0100)]
st/mesa: remove a weird msaa hack

It doesn't work and it's not clear how it's supposed to work.

Reviewed-by: Brian Paul <brianp@vmware.com>
11 years agosoftpipe: implement seamless cubemap support. (v1.1)
Dave Airlie [Tue, 11 Dec 2012 09:52:48 +0000 (19:52 +1000)]
softpipe: implement seamless cubemap support. (v1.1)

This adds seamless sampling for cubemap boundaries if requested.

The corner case averaging is messy but seems like it should be spec
compliant.

The face direction stuff is also a bit messy, I've no idea if that could
or should be simpler, or even if all my directions are fully correct!

v1.1: update comments, drop unneeded seamless calls for nearest, fix
if statement layout.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
11 years agogallium: fix cap warnings for tbo cap.
Dave Airlie [Tue, 11 Dec 2012 21:15:31 +0000 (07:15 +1000)]
gallium: fix cap warnings for tbo cap.

Signed-off-by: Dave Airlie <airlied@redhat.com>
11 years agoglsl_to_tgsi: emit multi-level structs and arrays properly.
Dave Airlie [Mon, 10 Dec 2012 07:20:05 +0000 (17:20 +1000)]
glsl_to_tgsi: emit multi-level structs and arrays properly.

This follows the code from the i965 driver, and emits the structs
and arrays recursively.

This fixes an assert in the two UBO tests
fs-struct-copy-complicated and
vs-struct-copy-complicated

These tests now pass on softpipe, with no regressions.

Signed-off-by: Dave Airlie <airlied@redhat.com>
11 years agollvmpipe: don't use user constant buffers
Brian Paul [Mon, 10 Dec 2012 19:35:23 +0000 (12:35 -0700)]
llvmpipe: don't use user constant buffers

This fixes some use-after-free issues.  I haven't measured any real
performance difference with a handful of Mesa demos.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
11 years agollvmpipe: support pipe_resource-based constant buffers
Brian Paul [Mon, 10 Dec 2012 19:31:46 +0000 (12:31 -0700)]
llvmpipe: support pipe_resource-based constant buffers

Before this we only supported user-based constant buffers.

First, we basically plumb pipe_constant_buffer objects through llvmpipe
rather than pipe_resource objects.

Second, update llvmpipe_set_constant_buffer() and try_update_scene_state()
so they understand both resource- and user-based constant buffers.

The problem with user constant buffers is the potential for use-after-free,
as seen in some WebGL tests.  The next patch will flip the switch for
resource-based const buffers.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
11 years agoutil: add util_copy_constant_buffer() helper function
Brian Paul [Mon, 10 Dec 2012 19:29:08 +0000 (12:29 -0700)]
util: add util_copy_constant_buffer() helper function

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
11 years agoi965/fs: Improve performance of shaders that start out with a discard.
Eric Anholt [Thu, 6 Dec 2012 18:15:08 +0000 (10:15 -0800)]
i965/fs: Improve performance of shaders that start out with a discard.

I had tried this in the past, but ran into trouble with applications
that sample from undiscarded pixels in the same subspan.  To fix that
issue, only jump to the end for an entire subspan at a time.
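
The per-subspan rule can be illustrated on a bitmask of discarded pixels.
This sketch assumes each 2x2 subspan occupies four consecutive bits of the
mask; the function name and mask layout are illustrative, not the
generated shader code.

```cpp
#include <cstdint>

// Returns the mask of pixels that may jump to the end of the shader:
// a pixel qualifies only if all four pixels of its subspan are
// discarded, since live neighbours may still sample from it.
// `width` is the number of pixels in the mask (e.g. 8 or 16).
std::uint32_t subspans_fully_discarded(std::uint32_t discarded, int width)
{
    std::uint32_t result = 0;
    for (int bit = 0; bit < width; bit += 4) {
        std::uint32_t subspan = 0xFu << bit;
        if ((discarded & subspan) == subspan)
            result |= subspan;
    }
    return result;
}
```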

Improves GLbenchmark 2.7 (1024x768) performance by 7.9 +/- 1.5% (n=8).

v2: Drop the br variable in the jump instruction -- if I ever do jumps
    pre-gen6, it'll be a different code block anyway since we don't have
    HALT until gen6.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoi965/fs: Rewrite discards to use a flag subreg to track discarded pixels.
Eric Anholt [Thu, 6 Dec 2012 20:15:13 +0000 (12:15 -0800)]
i965/fs: Rewrite discards to use a flag subreg to track discarded pixels.

This makes much more sense on gen6+, and will also prove useful for
early exit of shaders on discard.

v2: fix up a stale comment from before converting gen4-5.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoi965/fs: Add an instruction flag for choosing the flag subregister.
Eric Anholt [Thu, 6 Dec 2012 18:36:11 +0000 (10:36 -0800)]
i965/fs: Add an instruction flag for choosing the flag subregister.

We're going to redo discard handling to track discards in the other flag
subregister, saving instructions in the discard and allowing predicated
jumps out to the end of the shader.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoi965: Let brw_flag_reg() choose the flag reg and subreg.
Eric Anholt [Thu, 6 Dec 2012 18:43:13 +0000 (10:43 -0800)]
i965: Let brw_flag_reg() choose the flag reg and subreg.

We're about to start using the f0.1 subregister.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoi965: Print the flag reg updated by conditional modifiers.
Eric Anholt [Thu, 6 Dec 2012 19:48:25 +0000 (11:48 -0800)]
i965: Print the flag reg updated by conditional modifiers.

This makes our output more consistent with other disasm tools, and
will be necessary when we start using f0.1.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoi965: Add the new flag_reg_nr instruction field from IVB.
Eric Anholt [Thu, 6 Dec 2012 19:35:28 +0000 (11:35 -0800)]
i965: Add the new flag_reg_nr instruction field from IVB.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoi965: Correct the name and usage of the flag subregister number field.
Eric Anholt [Thu, 6 Dec 2012 18:55:26 +0000 (10:55 -0800)]
i965: Correct the name and usage of the flag subregister number field.

We've been calling it a register number, it's actually the subregister,
and things will get confusing once we start using it if it isn't fixed.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoi965: Remove bogus flag_reg_nr field from bits3.
Eric Anholt [Thu, 6 Dec 2012 19:31:31 +0000 (11:31 -0800)]
i965: Remove bogus flag_reg_nr field from bits3.

There's a flag subreg nr field in bits2 next to src0.vertstride, but
there shouldn't be anything in bits3 next to src1.vertstride.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agost/egl/drm: only unref the udev device if needed
Tobias Droste [Thu, 29 Nov 2012 16:02:28 +0000 (17:02 +0100)]
st/egl/drm: only unref the udev device if needed

Fixes compiler warning:

drm/native_drm.c: In function ‘native_create_display’:
drm/native_drm.c:180:21: warning: ‘device’ may be used uninitialized in this function [-Wmaybe-uninitialized]
drm/native_drm.c:157:24: note: ‘device’ was declared here

Signed-off-by: Tobias Droste <tdroste@gmx.de>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
11 years agosoftpipe: Use os_time_get_nano() everywhere.
José Fonseca [Sat, 8 Dec 2012 11:45:58 +0000 (11:45 +0000)]
softpipe: Use os_time_get_nano() everywhere.

11 years agoclover: Install CL headers.
Johannes Obermayr [Tue, 4 Dec 2012 13:18:03 +0000 (14:18 +0100)]
clover: Install CL headers.

Note: This is a candidate for the stable branches.

11 years agogallivm: Lower TGSI_OPCODE_MUL to fmul by default
Tom Stellard [Thu, 6 Dec 2012 19:56:21 +0000 (11:56 -0800)]
gallivm: Lower TGSI_OPCODE_MUL to fmul by default

This fixes a number of crashes on r600g due to the fact that
lp_build_mul assumes vector types when optimizing mul to bit shifts.

This bug was uncovered by 0ad1fefd6951aa47ab58a41dc9ee73083cbcf85c

11 years agollvmpipe: fix txq for 1d/2d arrays. (v3)
Dave Airlie [Sat, 8 Dec 2012 06:00:30 +0000 (06:00 +0000)]
llvmpipe: fix txq for 1d/2d arrays. (v3)

Noticed would fail, we were doing two things wrong

a) 1d arrays require the layers in height
b) minifying the layers field.

v2: don't change height code, fixup completely inside txq
as suggested by Roland.

v3: just add minify before texture array size
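
The intended result for a 1D array texture can be sketched as below: the
width shrinks with the mip level, while the layer count does not. The
struct and function names are made up for the example, and the real fix
operates on generated LLVM IR rather than plain integers.

```cpp
#include <algorithm>

// Size of a dimension at a given mip level, never less than 1.
unsigned minify(unsigned value, unsigned level)
{
    return std::max(1u, value >> level);
}

// Hypothetical result of a size query on a 1D array texture.
struct ArraySize1D {
    unsigned width;
    unsigned layers;
};

// Minify the width only; the number of layers is the same at every level.
ArraySize1D query_1d_array_size(unsigned base_width, unsigned layers,
                                unsigned level)
{
    ArraySize1D size = { minify(base_width, level), layers };
    return size;
}
```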

v1: Reviewed-by: Jose Fonseca <jfonseca@vmware.com>

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
11 years agollvmpipe: increase texture target width to reflect increase
Dave Airlie [Sat, 8 Dec 2012 05:41:03 +0000 (05:41 +0000)]
llvmpipe: increase texture target width to reflect increase

Now that we've gone over 7.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
11 years agomesa syncobj: don't store a pointer to the set_entry
Jordan Justen [Sat, 8 Dec 2012 20:43:10 +0000 (12:43 -0800)]
mesa syncobj: don't store a pointer to the set_entry

The set_entry pointer can become invalid if the set table
is re-hashed.

This likely will fix
https://bugs.freedesktop.org/show_bug.cgi?id=58012
(Regression since 56e95d3c)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
11 years agovega: remove unused variables
Fabio Pedretti [Fri, 7 Dec 2012 22:00:05 +0000 (23:00 +0100)]
vega: remove unused variables

Signed-off-by: Brian Paul <brianp@vmware.com>
11 years agonvc0: comment unused nvc0_validate_zcull function
Fabio Pedretti [Fri, 7 Dec 2012 22:00:00 +0000 (23:00 +0100)]
nvc0: comment unused nvc0_validate_zcull function

Signed-off-by: Brian Paul <brianp@vmware.com>
11 years agonv50: remove unused OpClassStr array
Fabio Pedretti [Fri, 7 Dec 2012 21:59:53 +0000 (22:59 +0100)]
nv50: remove unused OpClassStr array

Signed-off-by: Brian Paul <brianp@vmware.com>
11 years agor200: fix broken tcl lighting
smoki [Mon, 10 Dec 2012 16:30:26 +0000 (17:30 +0100)]
r200: fix broken tcl lighting

The command mistakenly used vector instead of scalar emit (the more or less
identical code in radeon is already correct).
It has probably been broken ever since kms.
Should fix bugs 22576, 26809.

11 years agost_glsl_to_tgsi: fix ubo bools.
Dave Airlie [Mon, 10 Dec 2012 04:25:49 +0000 (14:25 +1000)]
st_glsl_to_tgsi: fix ubo bools.

This should fix the ubo boolean tests, along with the previous
ubo loading fix.

Signed-off-by: Dave Airlie <airlied@redhat.com>
11 years agost_glsl_to_tgsi: call ubo load pass earlier
Dave Airlie [Mon, 10 Dec 2012 04:22:34 +0000 (14:22 +1000)]
st_glsl_to_tgsi: call ubo load pass earlier

This calls it at around the same place as the 965 driver does.

Signed-off-by: Dave Airlie <airlied@redhat.com>
11 years agoglsl_to_tgsi: fix texture offset translation
Dave Airlie [Mon, 10 Dec 2012 02:23:47 +0000 (12:23 +1000)]
glsl_to_tgsi: fix texture offset translation

I noticed the texelFetch offset test failed on 2D rect samplers
with GLSL 1.40. This is because I wrote the immediate->offset
translation wrong.

Fixed the translation to actually use the ureg info to set the
offsets up.

Signed-off-by: Dave Airlie <airlied@redhat.com>
11 years agodrisw: fix up context and apis for software context
Dave Airlie [Sun, 9 Dec 2012 10:28:56 +0000 (20:28 +1000)]
drisw: fix up context and apis for software context

This ports over from the dri2 code to the drisw bits. It means 3.1
core contexts now work for softpipe.

Signed-off-by: Dave Airlie <airlied@redhat.com>
11 years agoi965: Add missing _NEW_BUFFERS dirty bit in Gen7 SBE state.
Kenneth Graunke [Thu, 29 Nov 2012 10:40:09 +0000 (02:40 -0800)]
i965: Add missing _NEW_BUFFERS dirty bit in Gen7 SBE state.

This is needed to compute render_to_fbo.  It even has the comment.

NOTE: This is a candidate for stable branches.

Reviewed-by: Eric Anholt <eric@anholt.net>
11 years agost/mesa: set PIPE_BIND_SAMPLER_VIEW for TBOs in st_bufferobj_data
Christoph Bumiller [Sat, 8 Dec 2012 15:02:54 +0000 (16:02 +0100)]
st/mesa: set PIPE_BIND_SAMPLER_VIEW for TBOs in st_bufferobj_data

11 years agonvc0/ir: allow neg,abs modifiers on OP_SET with integer result
Christoph Bumiller [Sat, 8 Dec 2012 18:46:14 +0000 (19:46 +0100)]
nvc0/ir: allow neg,abs modifiers on OP_SET with integer result

11 years agonvc0/ir/emit: fix check for flags register use in logic ops
Christoph Bumiller [Sat, 8 Dec 2012 14:06:43 +0000 (15:06 +0100)]
nvc0/ir/emit: fix check for flags register use in logic ops

11 years agodraw: fix/improve dirty state validation
Brian Paul [Fri, 7 Dec 2012 20:58:34 +0000 (13:58 -0700)]
draw: fix/improve dirty state validation

This patch does two things:

1. Constant buffer state changes were broken (but happened to work by
   dumb luck).  The problem is we weren't calling draw_do_flush() in
   draw_set_mapped_constant_buffer() when we changed that state.  All the
   other draw_set_foo() functions were calling draw_do_flush() already.

2. Use a simpler state validation step when we're changing light-weight
   parameter state such as constant buffers, viewport dims or clip planes.
   There's no need to revalidate the whole pipeline when changing state
   like that.  The new validation method is called bind_parameters()
   and is called instead of the prepare() method.  A new
   DRAW_FLUSH_PARAMETER_CHANGE flag is used to signal these light-weight
   state changes.  This results in a modest but measurable increase in
   FPS for many Mesa demos.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
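
The two points of the draw commit above can be sketched as follows. This
is a simplified model, not the draw module's code: the names
DRAW_FLUSH_PARAMETER_CHANGE, bind_parameters() and prepare() come from
the commit message, while toy_draw, toy_middle_end,
TOY_FLUSH_STATE_CHANGE, toy_flush(), toy_set_constant_buffer() and
toy_validate() are hypothetical, and the flag values are arbitrary.

```c
#include <assert.h>

enum {
    DRAW_FLUSH_PARAMETER_CHANGE = 0x1, /* named in the commit; value arbitrary */
    TOY_FLUSH_STATE_CHANGE      = 0x2  /* hypothetical: any heavier change */
};

/* Counters stand in for the real work so the effect is observable. */
struct toy_middle_end {
    unsigned prepare_calls;
    unsigned bind_parameters_calls;
};

struct toy_draw {
    struct toy_middle_end me;
    unsigned pending;   /* flush flags accumulated since last validation */
};

/* Full (expensive) revalidation of the whole pipeline. */
static void prepare(struct toy_middle_end *me)
{
    me->prepare_calls++;
}

/* Light-weight path: only re-bind constants, viewport, clip planes. */
static void bind_parameters(struct toy_middle_end *me)
{
    me->bind_parameters_calls++;
}

/* Record why a revalidation is needed. */
static void toy_flush(struct toy_draw *d, unsigned flags)
{
    d->pending |= flags;
}

/* Point 1: every state setter must flush, constant buffers included. */
static void toy_set_constant_buffer(struct toy_draw *d)
{
    toy_flush(d, DRAW_FLUSH_PARAMETER_CHANGE);
    /* ... store the new buffer pointer here ... */
}

/* Point 2: pick the cheap path when only parameters changed. */
static void toy_validate(struct toy_draw *d)
{
    if (d->pending == 0)
        return;
    if (d->pending & ~(unsigned)DRAW_FLUSH_PARAMETER_CHANGE)
        prepare(&d->me);
    else
        bind_parameters(&d->me);
    d->pending = 0;
}
```

In this model a constant-buffer update followed by validation calls only
bind_parameters(), while any other pending flag falls back to prepare().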
11 years agodraw: add reminder comments about similar code in different files
Brian Paul [Fri, 7 Dec 2012 19:21:08 +0000 (12:21 -0700)]
draw: add reminder comments about similar code in different files

When one function is changed, also look at the other.
Presently, there are some differences with respect to geometry
shaders and instanced drawing...

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
11 years agodraw: rearrange code in llvm_middle_end_prepare()
Brian Paul [Fri, 7 Dec 2012 19:15:28 +0000 (12:15 -0700)]
draw: rearrange code in llvm_middle_end_prepare()

To clean it up and make it look more like the non-LLVM
fetch_pipeline_prepare() function.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
11 years agodraw: fix comment typo
Brian Paul [Fri, 7 Dec 2012 19:41:22 +0000 (12:41 -0700)]
draw: fix comment typo

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
11 years agodraw: add comment on draw->pt.opt field
Brian Paul [Fri, 7 Dec 2012 19:33:27 +0000 (12:33 -0700)]
draw: add comment on draw->pt.opt field

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
11 years agodraw: update a comment about index buffers
Brian Paul [Fri, 7 Dec 2012 19:26:18 +0000 (12:26 -0700)]
draw: update a comment about index buffers

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
11 years agogallium/os: Fix nano->micro second conversion.
José Fonseca [Sat, 8 Dec 2012 11:15:46 +0000 (11:15 +0000)]
gallium/os: Fix nano->micro second conversion.

copy'n'paste: best friend, worst enemy..

Trivial.
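
For reference, a minimal sketch of the conversion the commit above is
about. The commit message does not say what the incorrect code looked
like, so this only shows the correct direction; ns_to_us() and
NSEC_PER_USEC are hypothetical names, not the gallium/os API.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_USEC 1000   /* 1 microsecond = 1000 nanoseconds */

/* Convert a non-negative nanosecond count to whole microseconds,
 * discarding the sub-microsecond remainder. */
static int64_t ns_to_us(int64_t ns)
{
    assert(ns >= 0);
    return ns / NSEC_PER_USEC;
}
```

The usual copy-and-paste slips here are multiplying instead of dividing,
or reusing a factor that belongs to a different unit pair.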

11 years agollvmpipe: fix missing tbo cap warning.
Dave Airlie [Sat, 8 Dec 2012 03:46:32 +0000 (03:46 +0000)]
llvmpipe: fix missing tbo cap warning.

Signed-off-by: Dave Airlie <airlied@redhat.com>
11 years agomesa/st: add ARB_uniform_buffer_object support (v2)
Dave Airlie [Thu, 6 Dec 2012 06:16:10 +0000 (16:16 +1000)]
mesa/st: add ARB_uniform_buffer_object support (v2)

This adds UBO support to the state tracker; it works with softpipe
as-is.

It uses UARL + CONST[x][ADDR[0].x] type constructs.

v2: don't disable UBOs if geom shaders don't exist (me)
rename upload to bind (calim)
fix 12 -> 13 comparison as comment (calim + brianp)
fix signed->unsigned (Brian)
remove assert (Brian)

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
11 years agosoftpipe: enable GLSL 1.40
Dave Airlie [Thu, 6 Dec 2012 06:14:25 +0000 (16:14 +1000)]
softpipe: enable GLSL 1.40

This enables GLSL 1.40 advertising by softpipe.

Signed-off-by: Dave Airlie <airlied@redhat.com>
11 years agosoftpipe: add texture buffer object support
Dave Airlie [Thu, 6 Dec 2012 06:14:03 +0000 (16:14 +1000)]
softpipe: add texture buffer object support

This adds TBO support to softpipe.

Signed-off-by: Dave Airlie <airlied@redhat.com>
11 years agost/mesa: add option to enable GLSL 1.40
Dave Airlie [Thu, 6 Dec 2012 06:13:15 +0000 (16:13 +1000)]
st/mesa: add option to enable GLSL 1.40

Allow GLSL 1.40 to be enabled if the driver advertises it.

Signed-off-by: Dave Airlie <airlied@redhat.com>
11 years agost/mesa: add texture buffer object support to state tracker (v1.1)
Dave Airlie [Thu, 6 Dec 2012 06:12:11 +0000 (16:12 +1000)]
st/mesa: add texture buffer object support to state tracker (v1.1)

This adds the necessary changes to the st to allow texture buffer object
support if the driver advertises it.

v1.1: remove extra blank line and whitespace

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
11 years agogallium: add new texture buffer object capability
Dave Airlie [Thu, 6 Dec 2012 06:10:40 +0000 (16:10 +1000)]
gallium: add new texture buffer object capability

This just adds the define to the header.

Signed-off-by: Dave Airlie <airlied@redhat.com>
11 years agomesa/meta: Move declaration before statements.
José Fonseca [Sat, 8 Dec 2012 01:05:52 +0000 (01:05 +0000)]
mesa/meta: Move declaration before statements.

11 years agomesa: Move declaration before statement.
José Fonseca [Sat, 8 Dec 2012 01:02:30 +0000 (01:02 +0000)]
mesa: Move declaration before statement.

For MSVC's sake.
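
As background for the two "Move declaration before statement" commits:
MSVC at the time compiled C as C89, which requires all declarations in a
block to precede the first statement. A small sketch of the accepted
layout follows; sum_positive() is a hypothetical function, not code from
Mesa.

```c
#include <assert.h>

/* Sum the positive entries of `values`, written in C89 style. */
static int sum_positive(const int *values, int count)
{
    int i;              /* every declaration comes first ... */
    int total = 0;

    assert(count >= 0); /* ... and only then the first statement */
    for (i = 0; i < count; i++) {
        if (values[i] > 0)
            total += values[i];
    }
    return total;
}
```

Declaring `int total = 0;` after the assert() line would be valid C99
but is rejected by a C89-only compiler.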