mesa.git
10 years ago r600g/bc: add support for indexed memory writes.
Dave Airlie [Mon, 13 Jan 2014 00:19:00 +0000 (10:19 +1000)]
r600g/bc: add support for indexed memory writes.

It looks like we need these for geom shaders in the future.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
10 years ago r600g: move barrier and end_of_program bits from output to cf struct (v2)
Vadim Girlin [Wed, 31 Jul 2013 16:02:22 +0000 (20:02 +0400)]
r600g: move barrier and end_of_program bits from output to cf struct (v2)

v2: fix regression on r600 NOP instructions.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
10 years ago r600g: split streamout emit code into a separate function
Dave Airlie [Wed, 29 Jan 2014 01:33:14 +0000 (01:33 +0000)]
r600g: split streamout emit code into a separate function

For geometry shaders we need to call this code from a second place.

Just move it out for now to keep future patches cleaner.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
10 years ago r600g,radeonsi: skip unnecessary buffer_is_busy call, add a comment
Marek Olšák [Sat, 1 Feb 2014 14:06:39 +0000 (15:06 +0100)]
r600g,radeonsi: skip unnecessary buffer_is_busy call, add a comment

10 years ago r600g,radeonsi: skip busy-checking for DISCARD_RANGE if it has been done already
Marek Olšák [Sat, 1 Feb 2014 13:59:28 +0000 (14:59 +0100)]
r600g,radeonsi: skip busy-checking for DISCARD_RANGE if it has been done already

10 years ago r600g,radeonsi: treat DYNAMIC and STREAM usage as STAGING
Marek Olšák [Sat, 1 Feb 2014 13:01:20 +0000 (14:01 +0100)]
r600g,radeonsi: treat DYNAMIC and STREAM usage as STAGING

10 years ago gallium: remove PIPE_CAP_MAX_COMBINED_SAMPLERS
Marek Olšák [Fri, 17 Jan 2014 21:52:28 +0000 (22:52 +0100)]
gallium: remove PIPE_CAP_MAX_COMBINED_SAMPLERS

This can be derived from the shader caps.

All GPUs from ATI/AMD, NVIDIA, and INTEL have separate texture slots
for each shader stage.

10 years ago mesa: remove stray bits of GL_EXT_cull_vertex
Brian Paul [Tue, 4 Feb 2014 17:38:59 +0000 (10:38 -0700)]
mesa: remove stray bits of GL_EXT_cull_vertex

GL_EXT_cull_vertex was removed back in 2010 in commit 02984e3536
but these bits still lingered.

Reviewed-by: Eric Anholt <eric@anholt.net>
10 years ago glsl: Fix continue statements in do-while loops.
Paul Berry [Fri, 31 Jan 2014 17:55:35 +0000 (09:55 -0800)]
glsl: Fix continue statements in do-while loops.

From the GLSL 4.40 spec, section 6.4 (Jumps):

    The continue jump is used only in loops. It skips the remainder of
    the body of the inner most loop of which it is inside. For while
    and do-while loops, this jump is to the next evaluation of the
    loop condition-expression from which the loop continues as
    previously defined.

Previously, we incorrectly treated a "continue" statement as jumping
to the top of a do-while loop.

This patch fixes the problem by replicating the loop condition when
converting the "continue" statement to IR.  (We already do a similar
thing in "for" loops, to ensure that "continue" causes the loop
expression to be executed).
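
The quoted rule is the same as C's, so the fix can be illustrated with a small C sketch. This is not the GLSL compiler's code, and the function names are made up for illustration. `source_form` is the loop as written; `lowered_form` is an unconditional loop in which the condition is replicated in front of the lowered `continue`, along the lines described above.

```c
/* The loop as an author would write it: `continue` must jump to the
 * `while (i < limit)` test, not back to the top of the body. */
int source_form(int limit)
{
    int i = 0, runs = 0;
    do {
        runs++;
        i++;
        if (i % 2 == 0)
            continue;
    } while (i < limit);
    return runs;
}

/* Hand-lowered equivalent: an unconditional loop with the condition
 * replicated before the lowered `continue` and at the end of the body. */
int lowered_form(int limit)
{
    int i = 0, runs = 0;
    for (;;) {
        runs++;
        i++;
        if (i % 2 == 0) {
            if (!(i < limit))   /* replicated loop condition */
                break;
            continue;
        }
        if (!(i < limit))       /* original loop condition */
            break;
    }
    return runs;
}
```

If the lowered `continue` skipped the replicated check, the even iterations would never test the condition and the loop could run past its limit.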

Fixes piglit tests:
- glsl-fs-continue-inside-do-while.shader_test
- glsl-vs-continue-inside-do-while.shader_test
- glsl-fs-continue-in-switch-in-do-while.shader_test
- glsl-vs-continue-in-switch-in-do-while.shader_test

Cc: mesa-stable@lists.freedesktop.org
Acked-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years ago glsl: Make condition_to_hir() callable from outside ast_iteration_statement.
Paul Berry [Fri, 31 Jan 2014 17:50:37 +0000 (09:50 -0800)]
glsl: Make condition_to_hir() callable from outside ast_iteration_statement.

In addition to making it public, we also need to change its first
argument from an ir_loop * to an exec_list *, so that it can be used
to insert the condition anywhere in the IR (rather than just in the
body of the loop).

This will be necessary in order to make continue statements work
properly in do-while loops.

Cc: mesa-stable@lists.freedesktop.org
Acked-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years ago i965/blorp: do not use unnecessary hw-blending support
Topi Pohjolainen [Mon, 27 Jan 2014 08:50:01 +0000 (10:50 +0200)]
i965/blorp: do not use unnecessary hw-blending support

This is really not needed as blorp blit programs already sample
XRGB normally and get alpha channel set to 1.0 automatically by
the sampler engine. This is simply copied directly to the payload
of the render target write message and hence there is no need for
any additional blending support from the pixel processing pipeline.

The blending formula is broken for color components anyway: it
multiplies each color component by itself (the blend factor is the
component itself).
Alpha blending, in turn, would not force alpha to one regardless of
the source; it simply used the source alpha as is
(1.0 * src_alpha + 0.0 * dst_alpha).
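
The arithmetic in the paragraph above can be written out as a short C sketch. The function names are hypothetical, and the color function assumes the destination contributes nothing to the result, which the message does not state explicitly.

```c
/* Color path as described: the blend factor is the source component itself.
 * Assumption: the destination term is zero. */
float sketch_blend_color(float src)
{
    return src * src;
}

/* Alpha path as described: 1.0 * src_alpha + 0.0 * dst_alpha,
 * which leaves the source alpha unchanged instead of forcing 1.0. */
float sketch_blend_alpha(float src_alpha, float dst_alpha)
{
    return 1.0f * src_alpha + 0.0f * dst_alpha;
}
```

Under this model 0.0 and 1.0 map to themselves, while 0.5 becomes 0.25, so only intermediate component values expose the problem.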

Quoting Eric:

 "If we want to actually make the no-alpha-bits-present thing work,
  we need to override the bits in the surface state or in the
  generated code.  In the normal draw path, it's done for sampling
  by the swizzling code in brw_wm_surface_state.c, and the blending
  overrides is just to fix up the alpha blending stage which
  doesn't pay attention to that for the destination surface."

If one modifies piglit test gl-3.2-layered-rendering-blit to use
color component values other than zero or one, this change will
kick in on IVB. No regressions on IVB.

This is effectively a revert of c0554141a9b831b4e614747104dcbbe0fe489b9d:

    i965/blorp: Support overriding destination alpha to 1.0.

    Currently, Blorp requires the source and destination formats to be
    equal.  However, we'd really like to be able to blit between XRGB and
    ARGB formats; our BLT engine paths have supported this for a long time.

    For ARGB -> XRGB, nothing needs to occur: the missing alpha is already
    interpreted as 1.0.  For XRGB -> ARGB, we need to smash the alpha
    channel to 1.0 when writing the destination colors.  This is fairly
    straightforward with blending.

    For now, this code is never used, as the source and destination formats
    still must be equal.  The next patch will relax that restriction.

    NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
10 years ago radeon/uvd: fix feedback buffer handling v2
Christian König [Mon, 3 Feb 2014 09:28:58 +0000 (02:28 -0700)]
radeon/uvd: fix feedback buffer handling v2

Without the correct feedback buffer size UVD runs
into an error on each frame, reducing the maximum FPS.

v2: address Michel's comments

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Cc: "10.1" "10.0" "9.2" <mesa-stable@lists.freedesktop.org>
10 years ago i965: Use brw_bo_map[_gtt]() in intel_miptree_map_raw().
Kenneth Graunke [Wed, 29 Jan 2014 17:27:09 +0000 (09:27 -0800)]
i965: Use brw_bo_map[_gtt]() in intel_miptree_map_raw().

This moves the intel_batchbuffer_flush before the drm_intel_bo_busy
call, which is a change in behavior.  However, the old behavior was
broken.

In the future, we may want to only flush if the batchbuffer references
the BO being mapped.  That's certainly more typical.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
10 years ago i965: Use brw_bo_map() in intel_texsubimage_tiled_memcpy().
Kenneth Graunke [Wed, 29 Jan 2014 17:24:32 +0000 (09:24 -0800)]
i965: Use brw_bo_map() in intel_texsubimage_tiled_memcpy().

This additionally measures the time stalled, while also simplifying the
code.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
10 years ago i965: Create drm_intel_bo_map wrappers with performance warnings.
Kenneth Graunke [Wed, 29 Jan 2014 17:09:18 +0000 (09:09 -0800)]
i965: Create drm_intel_bo_map wrappers with performance warnings.

Mapping a buffer is a common place where we could stall the CPU.

In a few places, we've added special code to check whether a buffer is
busy and log the stall as a performance warning.  Most of these give no
indication of the severity of the stall, though, since measuring the
time is a small hassle.

This patch introduces a new brw_bo_map() function which wraps
drm_intel_bo_map, but additionally measures the time stalled and reports
a performance warning.  If performance debugging is not enabled, it
simply maps the buffer with negligible overhead.

We also add a similar wrapper for drm_intel_gem_bo_map_gtt().

This should make it easy to add performance warnings in lots of places.
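
The wrapper idea can be sketched in self-contained C as follows. This is only an illustration and not the driver's code: `sketch_ctx`, `sketch_bo`, `sketch_raw_map` (standing in for drm_intel_bo_map) and `sketch_bo_map` are all hypothetical names, and the busy flag is a plain struct field rather than a kernel query.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <time.h>

/* Hypothetical stand-ins for the driver context and a buffer object. */
struct sketch_ctx { int perf_debug; int warnings; double last_stall_ms; };
struct sketch_bo  { int busy; char storage[16]; void *virt; };

/* Stand-in for the real map call: returns 0 on success. */
int sketch_raw_map(struct sketch_bo *bo)
{
    bo->virt = bo->storage;
    bo->busy = 0;
    return 0;
}

static double now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1000.0 + ts.tv_nsec / 1.0e6;
}

int sketch_bo_map(struct sketch_ctx *ctx, struct sketch_bo *bo, const char *name)
{
    /* Fast path: no timing when debugging is off or the buffer is idle. */
    if (!ctx->perf_debug || !bo->busy)
        return sketch_raw_map(bo);

    double start = now_ms();
    int ret = sketch_raw_map(bo);
    ctx->last_stall_ms = now_ms() - start;
    ctx->warnings++;
    fprintf(stderr, "perf: stalled %.3f ms mapping %s\n",
            ctx->last_stall_ms, name);
    return ret;
}
```

Keeping the timing behind the `perf_debug` check is what keeps the non-debug path close to a plain map call.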

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
10 years ago freedreno: enabling binning and opt by default
Rob Clark [Mon, 3 Feb 2014 16:28:30 +0000 (11:28 -0500)]
freedreno: enabling binning and opt by default

Hw binning pass doesn't seem to have broken anything.  And optimizing
compiler fixes a lot of shaders and doesn't seem to break anything.  So
re-org slightly FD_MESA_DEBUG params and make both hw binning and
optimizer enabled by default.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years ago freedreno/a3xx/compiler: new compiler
Rob Clark [Wed, 29 Jan 2014 22:18:49 +0000 (17:18 -0500)]
freedreno/a3xx/compiler: new compiler

The new compiler generates a dependency graph of instructions, including
a few meta-instructions to handle PHI and preserve some extra
information needed for register assignment, etc.

The depth pass assigns a weight/depth to each node (based on the sum of
instruction cycles of a given node and all its dependent nodes), which
is used to schedule instructions.  The scheduling takes into account the
minimum number of cycles/slots between dependent instructions, etc.,
which was something that could not be handled properly with the original
compiler (which was more of a naive TGSI translator than an actual
compiler).
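
A rough C sketch of such a weight computation is given below. It is not the freedreno implementation: the struct, the two-source limit, and the 64-instruction cap are invented for illustration, and it assumes "dependent nodes" means the instructions a node reads from, each counted once.

```c
#define SKETCH_MAX_SRCS   2
#define SKETCH_MAX_INSTRS 64

/* Hypothetical instruction node: a cycle cost plus the indices of the
 * instructions it reads from (-1 marks an unused source slot). */
struct sketch_instr {
    int cycles;
    int srcs[SKETCH_MAX_SRCS];
};

static int sum_cycles(const struct sketch_instr *instrs, int idx,
                      unsigned char *seen)
{
    if (idx < 0 || seen[idx])
        return 0;               /* no source, or already counted */
    seen[idx] = 1;

    int total = instrs[idx].cycles;
    for (int s = 0; s < SKETCH_MAX_SRCS; s++)
        total += sum_cycles(instrs, instrs[idx].srcs[s], seen);
    return total;
}

/* Weight of node `idx`: its own cycles plus those of everything it
 * transitively depends on.  Returns -1 if the program is too large. */
int sketch_instr_weight(const struct sketch_instr *instrs, int count, int idx)
{
    unsigned char seen[SKETCH_MAX_INSTRS] = {0};
    if (count > SKETCH_MAX_INSTRS || idx >= count)
        return -1;
    return sum_cycles(instrs, idx, seen);
}
```

A scheduler could then prefer the ready instruction with the largest weight, since more work is waiting behind it.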

The register assignment is currently split out as a standalone pass.  I
expect that it will be replaced at some point, once I figure out what to
do about relative addressing (which is currently the only thing that
should cause fallback to old compiler).

There are a couple new debug options for FD_MESA_DEBUG env var:

  optmsgs - enable debug prints in optimizer
  optdump - dump instruction graph in .dot format, for example:

http://people.freedesktop.org/~robclark/a3xx/frag-0000.dot.png
http://people.freedesktop.org/~robclark/a3xx/frag-0000.dot

At this point, thanks to proper handling of instruction scheduling, the
new compiler fixes a lot of things that were broken before, and does not
appear to break anything that was working before[1].  So even though it
is not finished, it seems useful to merge it in its current state.

[1] Not merged in this commit, because I'm not sure if it really belongs
in mesa tree, but the following commit implements a simple shader
emulator, which I've used to compare the output of the new compiler to
the original compiler (ie. run it on all the TGSI shaders dumped out via
ST_DEBUG=tgsi with various games/apps):

https://github.com/freedreno/mesa/commit/163b6306b1660e05ece2f00d264a8393d99b6f12

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years ago freedreno/a3xx/compiler: split out old compiler
Rob Clark [Wed, 29 Jan 2014 22:03:07 +0000 (17:03 -0500)]
freedreno/a3xx/compiler: split out old compiler

For the time being, keep old compiler as fallback for things that the
new compiler does not support yet.  Split out as its own commit to make
the later new-compiler commits easier to follow.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years ago freedreno/a3xx/compiler: prepare for new compiler
Rob Clark [Wed, 29 Jan 2014 21:25:52 +0000 (16:25 -0500)]
freedreno/a3xx/compiler: prepare for new compiler

Shuffle things around to prepare for new compiler.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years ago freedreno/a3xx: remove useless reg tracking in disasm-a3xx
Rob Clark [Wed, 29 Jan 2014 21:13:54 +0000 (16:13 -0500)]
freedreno/a3xx: remove useless reg tracking in disasm-a3xx

Not really used for anything anymore.  So strip it out and avoid
conflicting symbols with upcoming new-compiler.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years ago docs: Add release notes for 10.0.3
Carl Worth [Mon, 3 Feb 2014 21:54:50 +0000 (13:54 -0800)]
docs: Add release notes for 10.0.3

Which was just made.

10 years ago draw: fix incorrect color of flat-shaded clipped lines
Brian Paul [Mon, 3 Feb 2014 18:33:03 +0000 (11:33 -0700)]
draw: fix incorrect color of flat-shaded clipped lines

When we clipped a line we weren't copying the provoking vertex
color to the second vertex.  We also weren't checking for
first vs. last provoking vertex.
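
The behavior described can be sketched in C as below. The vertex struct and function name are hypothetical and much simpler than the draw module's real data structures; the point is only that the color is copied from whichever vertex is the provoking one.

```c
#include <string.h>

/* Hypothetical, simplified vertex: position plus one RGBA color. */
struct sketch_vertex {
    float pos[4];
    float color[4];
};

/* After clipping a flat-shaded line, make both endpoints carry the
 * provoking vertex's color.  `provoking_is_first` selects between the
 * first-vertex and last-vertex conventions. */
void sketch_flat_line_colors(struct sketch_vertex *v0,
                             struct sketch_vertex *v1,
                             int provoking_is_first)
{
    if (provoking_is_first)
        memcpy(v1->color, v0->color, sizeof v1->color);
    else
        memcpy(v0->color, v1->color, sizeof v0->color);
}
```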

Fixes failures found with the new piglit line-flat-clip-color test.

Cc: "10.0, 10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
10 years ago mesa: change GL_ALL_ATTRIB_BITS to 0xFFFFFFFF
Brian Paul [Sat, 1 Feb 2014 17:51:43 +0000 (10:51 -0700)]
mesa: change GL_ALL_ATTRIB_BITS to 0xFFFFFFFF

This has been wrong for many years.  It was originally 0x000FFFFF and long
ago there was discussion about whether GL_ALL_ATTRIB_BITS should include
the then-new GL_MULTISAMPLE_BIT bit.  Eventually the ARB decided that
glPushAttrib(GL_ALL_ATTRIB_BITS) should save all current and future
attribute groups (hence ~0).  Unfortunately, Mesa's gl.h was never updated.
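
The effect of the old value can be checked with a few lines of C. The macros below are local copies defined for this sketch (no GL header is included); 0x20000000 is the value gl.h gives GL_MULTISAMPLE_BIT, and `sketch_mask_covers` is a made-up helper.

```c
#include <stdint.h>

#define SKETCH_OLD_ALL_ATTRIB_BITS 0x000FFFFFu  /* old value in Mesa's gl.h */
#define SKETCH_NEW_ALL_ATTRIB_BITS 0xFFFFFFFFu  /* value after this change  */
#define SKETCH_MULTISAMPLE_BIT     0x20000000u  /* GL_MULTISAMPLE_BIT       */

/* Returns nonzero when every bit of `group` is selected by `mask`. */
int sketch_mask_covers(uint32_t mask, uint32_t group)
{
    return (mask & group) == group;
}
```

With the old mask the multisample group is not selected, so a push of "all" attributes would skip it; with ~0 any current or future group bit is covered.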

This was just recently spotted by Eric Anholt and reported as a bug to the
ARB.  Ian, Jon Leech and I discussed it at the ARB meeting and decided to
change Mesa's value to reflect the ARB's decision.

Acked-by: Eric Anholt <eric@anholt.net>
10 years ago gallium/auxiliary/indices: replace free() with FREE()
Brian Paul [Sat, 1 Feb 2014 00:27:04 +0000 (17:27 -0700)]
gallium/auxiliary/indices: replace free() with FREE()

To match the CALLOC_STRUCT() call.

Cc: "10.0, 10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
10 years ago svga: check shader size against max command buffer size
Brian Paul [Sat, 1 Feb 2014 00:23:11 +0000 (17:23 -0700)]
svga: check shader size against max command buffer size

If the shader is too large, plug in a dummy shader.  This patch also
reworks the existing dummy shader code.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
10 years ago svga: refactor some shader code
Brian Paul [Sat, 1 Feb 2014 00:23:11 +0000 (17:23 -0700)]
svga: refactor some shader code

Put common code in new svga_shader.c file.  Consolidate separate vertex/
fragment shader ID generation.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
10 years ago gallivm: fix opcode and function nesting
Zack Rusin [Tue, 28 Jan 2014 21:34:18 +0000 (16:34 -0500)]
gallivm: fix opcode and function nesting

gallivm soa code supported only a single level of nesting for
control flow opcodes (if, switch, loops...) but the d3d10 spec
clearly states that those are nested within functions. To support
nesting of conditionals inside functions we need to store the
nesting data inside function contexts and keep a stack of those.
Furthermore we make sure that if nesting for subroutines is deeper
than 32 then we simply ignore all subsequent 'call' invocations.
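
One way to picture the per-function nesting data is the C sketch below. It is not gallivm's code: the struct layout and function names are invented, and only the limit of 32 comes from the description above.

```c
#define SKETCH_MAX_FUNCTION_DEPTH 32  /* subroutine nesting limit from above */

/* Hypothetical per-function context: tracks conditional nesting only. */
struct sketch_func_ctx {
    int cond_depth;
};

/* Hypothetical execution state: a stack of function contexts. */
struct sketch_exec {
    struct sketch_func_ctx stack[SKETCH_MAX_FUNCTION_DEPTH];
    int depth;                /* number of active function contexts */
};

void sketch_exec_init(struct sketch_exec *e)
{
    e->depth = 1;             /* the main function */
    e->stack[0].cond_depth = 0;
}

/* Returns 1 if the call was entered, 0 if it was ignored (too deep). */
int sketch_exec_call(struct sketch_exec *e)
{
    if (e->depth >= SKETCH_MAX_FUNCTION_DEPTH)
        return 0;
    e->stack[e->depth].cond_depth = 0;  /* fresh nesting data per function */
    e->depth++;
    return 1;
}

void sketch_exec_ret(struct sketch_exec *e)
{
    if (e->depth > 1)
        e->depth--;
}

void sketch_exec_if(struct sketch_exec *e)
{
    e->stack[e->depth - 1].cond_depth++;
}

void sketch_exec_endif(struct sketch_exec *e)
{
    if (e->stack[e->depth - 1].cond_depth > 0)
        e->stack[e->depth - 1].cond_depth--;
}

int sketch_exec_cond_depth(const struct sketch_exec *e)
{
    return e->stack[e->depth - 1].cond_depth;
}
```

Because each call pushes a fresh context, conditionals opened inside a subroutine do not disturb the caller's nesting count, which is restored on return.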

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
10 years ago mesa: Drop unnecessary (void) ctx from VAO code.
Kenneth Graunke [Sun, 2 Feb 2014 06:04:13 +0000 (22:04 -0800)]
mesa: Drop unnecessary (void) ctx from VAO code.

ctx is always used, even on release builds.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years ago mesa: Remove "APPLE" from some VAO error messages.
Kenneth Graunke [Sun, 2 Feb 2014 05:25:42 +0000 (21:25 -0800)]
mesa: Remove "APPLE" from some VAO error messages.

Chances are, people will be using the core names these days.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years ago mesa: Update some comments relating to VAOs.
Kenneth Graunke [Sun, 2 Feb 2014 04:17:06 +0000 (20:17 -0800)]
mesa: Update some comments relating to VAOs.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years ago mesa: Rename ElementArrayBufferObj to IndexBufferObj.
Kenneth Graunke [Sun, 2 Feb 2014 03:46:45 +0000 (19:46 -0800)]
mesa: Rename ElementArrayBufferObj to IndexBufferObj.

DirectX and most hardware documentation use the term "Index Buffer" to
refer to a buffer containing indexes into arrays of vertex data, which
allows random access to vertex data, rather than sequential access.
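
As a minimal illustration of the concept (plain C, no GL calls, hypothetical function name), an index buffer is just an array of offsets into the vertex data, so vertices can be fetched in any order and reused:

```c
/* Gather vertex values through an index buffer: out[i] = vertex_x[ib[i]].
 * `vertex_x` holds one float per vertex to keep the sketch small. */
void sketch_fetch_indexed(const float *vertex_x, const unsigned short *ib,
                          int index_count, float *out)
{
    for (int i = 0; i < index_count; i++)
        out[i] = vertex_x[ib[i]];
}
```

Two triangles sharing an edge can be described with four vertices and six indices such as {0, 1, 2, 2, 1, 3}, instead of six separate vertices.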

OpenGL uses a different term for this concept: "Element Array Buffer".
However, "Index Buffer" has become much more widespread.  A quick
Google search shows 29,300 hits for "Element Array Buffer" vs.
82,300 hits for "Index Buffer."

Arguably, "Index Buffer" is clearer: an "element of an array" (or list)
usually refers to an actual item stored in the array, not the index used
to refer to it.

The terminology is also already used in Mesa: some VBO module code for
dealing with ElementArrayBufferObj names local variables "ib".

Completely generated by:
$ find . -type f -print0 | xargs -0 sed -i \
  's/ElementArrayBufferObj/IndexBufferObj/g'

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years ago mesa: Rename _mesa_lookup_arrayobj to _mesa_lookup_vao.
Kenneth Graunke [Sun, 2 Feb 2014 05:28:03 +0000 (21:28 -0800)]
mesa: Rename _mesa_lookup_arrayobj to _mesa_lookup_vao.

For consistency with the previous renames.

Completely generated by:
$ find . -type f -print0 | xargs -0 sed -i \
  's/_mesa_lookup_arrayobj/_mesa_lookup_vao/g'

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years ago mesa: Rename _mesa_..._array_obj functions to _mesa_..._vao.
Kenneth Graunke [Sun, 2 Feb 2014 04:48:51 +0000 (20:48 -0800)]
mesa: Rename _mesa_..._array_obj functions to _mesa_..._vao.

_mesa_update_vao_client_arrays() is less of a mouthful than
_mesa_update_array_object_client_arrays(), and generally clearer.

Generated by:
$ find . -type f -print0 | xargs -0 sed -i \
  's/_mesa_\([^_]*\)_array_object/_mesa_\1_vao/g'
with manual whitespace and indentation fixes applied.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years ago mesa: Rename "struct gl_array_object" to gl_vertex_array_object.
Kenneth Graunke [Sun, 2 Feb 2014 03:36:59 +0000 (19:36 -0800)]
mesa: Rename "struct gl_array_object" to gl_vertex_array_object.

I considered replacing it with "gl_vao", but spelling it out seemed to
fit better with Mesa's traditional style.  Mesa doesn't shy away from
long type names - consider gl_transform_feedback_object,
gl_fragment_program_state, gl_uniform_buffer_binding, and so on.

Completely generated by:
$ find . -type f -print0 | xargs -0 sed -i \
  's/gl_array_object/gl_vertex_array_object/g'

v2: Rerun command to resolve conflicts with Ian's meta patches.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years ago mesa: Rename "arrayObj" local variables to "vao".
Kenneth Graunke [Sun, 2 Feb 2014 03:31:22 +0000 (19:31 -0800)]
mesa: Rename "arrayObj" local variables to "vao".

Now that the field is named "VAO" instead of "ArrayObj", it makes sense
to call the local variables "vao" instead of "arrayObj".

Completely generated by:
$ find . -type f -print0 | xargs -0 sed -i 's/arrayObj/vao/g'

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years ago mesa: Rename ArrayObj to VAO and DefaultArrayObj to DefaultVAO.
Kenneth Graunke [Sun, 2 Feb 2014 03:14:38 +0000 (19:14 -0800)]
mesa: Rename ArrayObj to VAO and DefaultArrayObj to DefaultVAO.

When reading through the Mesa drawing code, it's not immediately obvious
to me that "ArrayObj" (gl_array_object) is the Vertex Array Object (VAO)
state.  The comment above the structure explains this, but readers still
have to remember this and translate accordingly.

Out of context, "array object" is fairly vague.  Even in context,
"array" has a lot of meanings: glDrawArrays, vertex data stored in user
arrays, gl_client_arrays, gl_vertex_attrib_arrays, and so on.

Using the term "VAO" immediately associates these fields with the OpenGL
concept, clarifying the situation and aiding programmer sanity.

Completely generated by:
$ find . -type f -print0 | xargs -0 sed -i \
  -e 's/ArrayObj;/VAO;/g'                  \
  -e 's/->ArrayObj/->VAO/g'                \
  -e 's/Array\.ArrayObj/Array.VAO/g'       \
  -e 's/Array\.DefaultArrayObj/Array.DefaultVAO/g'

v2: Rerun command to resolve conflicts with Ian's meta patches.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years ago meta: Silence several 'unused parameter' warnings
Ian Romanick [Sat, 14 Dec 2013 00:51:04 +0000 (16:51 -0800)]
meta: Silence several 'unused parameter' warnings

Silences many GCC warnings of the form:

drivers/common/meta.c: In function 'cleanup_temp_texture':
drivers/common/meta.c:1208:41: warning: unused parameter 'ctx' [-Wunused-parameter]
drivers/common/meta.c: In function 'setup_ff_blit_framebuffer':
drivers/common/meta.c:1453:46: warning: unused parameter 'ctx' [-Wunused-parameter]
drivers/common/meta.c: In function 'meta_glsl_blit_cleanup':
drivers/common/meta.c:1998:43: warning: unused parameter 'ctx' [-Wunused-parameter]
drivers/common/meta.c: In function 'meta_glsl_clear_cleanup':
drivers/common/meta.c:2287:44: warning: unused parameter 'ctx' [-Wunused-parameter]
drivers/common/meta.c: In function 'setup_ff_generate_mipmap':
drivers/common/meta.c:3365:45: warning: unused parameter 'ctx' [-Wunused-parameter]
drivers/common/meta.c: In function 'meta_glsl_generate_mipmap_cleanup':
drivers/common/meta.c:3556:54: warning: unused parameter 'ctx' [-Wunused-parameter]

There are a couple other similar warnings, but they are less trivial.  I
want to investigate these further before axing them.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years ago meta: Don't use fixed-function to decompress array textures
Ian Romanick [Fri, 13 Dec 2013 21:40:48 +0000 (13:40 -0800)]
meta: Don't use fixed-function to decompress array textures

Array textures can't be used with fixed-function, so don't.  Instead,
just drop the decompress request on the floor.  This is no worse than
what was done previously because generating the GL error (in
_mesa_set_enable) broke everything anyway.

A later patch will get GL_TEXTURE_2D_ARRAY targets working.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years ago meta: Use NDC in decompress_texture_image
Ian Romanick [Fri, 13 Dec 2013 22:12:09 +0000 (14:12 -0800)]
meta: Use NDC in decompress_texture_image

There is no need to use pixel coordinates, and using NDC directly will
simplify the GLSL paths.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years ago meta: Consistently use non-Apple VAO functions
Ian Romanick [Sat, 14 Dec 2013 19:27:29 +0000 (11:27 -0800)]
meta: Consistently use non-Apple VAO functions

For these objects, meta was already using the non-Apple function to
delete the objects.  Everywhere else in the file uses
_mesa_GenVertexArrays and _mesa_BindVertexArrays.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "9.1 9.2 10.0" <mesa-stable@lists.freedesktop.org>
10 years ago meta: Fallback to software for GetTexImage of compressed GL_TEXTURE_CUBE_MAP_ARRAY
Ian Romanick [Fri, 13 Dec 2013 23:59:38 +0000 (15:59 -0800)]
meta: Fallback to software for GetTexImage of compressed GL_TEXTURE_CUBE_MAP_ARRAY

The hardware decompression path isn't even close to being able to handle
this.  This converts the crash (assertion failure) in
"EXT_texture_compression_s3tc/getteximage-targets S3TC CUBE_ARRAY" to a
plain old failure.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "9.1 9.2 10.0" <mesa-stable@lists.freedesktop.org>
10 years ago meta: Release resources used by _mesa_meta_DrawPixels
Ian Romanick [Sat, 14 Dec 2013 19:58:45 +0000 (11:58 -0800)]
meta: Release resources used by _mesa_meta_DrawPixels

_mesa_meta_DrawPixels creates a VAO and (potentially) two fragment
programs, but none of them are ever released.  Leaking piles of memory
is generally frowned upon.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "9.1 9.2 10.0" <mesa-stable@lists.freedesktop.org>
10 years ago meta: Release resources used by decompress_texture_image
Ian Romanick [Fri, 13 Dec 2013 22:36:17 +0000 (14:36 -0800)]
meta: Release resources used by decompress_texture_image

decompress_texture_image creates an FBO, an RBO, a VBO, a VAO, and a
sampler object, but none of them are ever released.  Later patches will
add program objects, exacerbating the problem.  Leaking piles of memory
is generally frowned upon.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "9.1 9.2 10.0" <mesa-stable@lists.freedesktop.org>
10 years ago mesa: Use common _mesa_tex_target_to_index in tex param code
Ian Romanick [Sat, 23 Nov 2013 20:16:57 +0000 (12:16 -0800)]
mesa: Use common _mesa_tex_target_to_index in tex param code

TEXTURE_BUFFER_INDEX has to be specially called out because it is not
allowed in any of the glTexParameter or glGetTexParameter functions.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years ago mesa: Make target_enum_to_index available outside texobj.c
Ian Romanick [Fri, 22 Nov 2013 19:35:30 +0000 (11:35 -0800)]
mesa: Make target_enum_to_index available outside texobj.c

The next patch will use this function in another file.

v2: Rename _mesa_target_enum_to_index to _mesa_tex_target_to_index.
Suggested by Brian.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years ago mesa: make several FBO functions static
Brian Paul [Sat, 1 Feb 2014 15:58:43 +0000 (08:58 -0700)]
mesa: make several FBO functions static

The four functions in question weren't called from any other file.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years ago mesa: move glGenerateMipmap() code into new genmipmap.c file
Brian Paul [Sat, 1 Feb 2014 15:58:43 +0000 (08:58 -0700)]
mesa: move glGenerateMipmap() code into new genmipmap.c file

Mipmap generation has nothing to do with FBOs.
v2: update gl_genexec.py too (not api_exec.c)

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
10 years ago mesa: move glBlitFramebuffer code into new blit.c file
Brian Paul [Sat, 1 Feb 2014 15:58:43 +0000 (08:58 -0700)]
mesa: move glBlitFramebuffer code into new blit.c file

Just for better organization.
v2: update gl_genexec.py too (not api_exec.c)

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
10 years ago mesa: don't signal _NEW_TEXTURE in TexSubImage() functions
Brian Paul [Sat, 1 Feb 2014 00:38:35 +0000 (17:38 -0700)]
mesa: don't signal _NEW_TEXTURE in TexSubImage() functions

glTexSubImage(), glCopyTexSubImage() and glCompressedTexSubImage()
only change the texel data, not other state like texture size or format.
If a driver really needs to do something special it can hook into the
corresponding driver functions or Map/UnmapTextureImage().

This should avoid some needless state validation effort.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
10 years ago mesa: add some comments about mipmap generation
Brian Paul [Sat, 1 Feb 2014 00:38:35 +0000 (17:38 -0700)]
mesa: add some comments about mipmap generation

Trivial.

10 years ago mesa: simplify comment in texstorage.c
Brian Paul [Sat, 1 Feb 2014 00:38:35 +0000 (17:38 -0700)]
mesa: simplify comment in texstorage.c

Trivial.

10 years ago mesa: formatting fixes, 78-column wrappings in dd.h
Brian Paul [Sat, 1 Feb 2014 00:38:35 +0000 (17:38 -0700)]
mesa: formatting fixes, 78-column wrappings in dd.h

Trivial.

10 years ago mesa: remove target param from ctx->Driver.TexParameter()
Brian Paul [Sat, 1 Feb 2014 00:28:08 +0000 (17:28 -0700)]
mesa: remove target param from ctx->Driver.TexParameter()

Not really used anywhere.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years ago gallivm: add a few const qualifiers
Brian Paul [Sat, 1 Feb 2014 00:28:08 +0000 (17:28 -0700)]
gallivm: add a few const qualifiers

Trivial.

10 years ago translate: reindent translate_sse.c
Brian Paul [Sat, 1 Feb 2014 00:27:04 +0000 (17:27 -0700)]
translate: reindent translate_sse.c

Trivial.

10 years ago mesa: make _mesa_get_proxy_target() static
Brian Paul [Mon, 27 Jan 2014 19:32:28 +0000 (12:32 -0700)]
mesa: make _mesa_get_proxy_target() static

Wasn't used in any other file.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years ago mesa: remove unused _mesa_select_tex_object() function
Brian Paul [Mon, 27 Jan 2014 19:10:41 +0000 (12:10 -0700)]
mesa: remove unused _mesa_select_tex_object() function

The _mesa_get_current_tex_object() function is now used everywhere that
_mesa_select_tex_object() was formerly used.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years ago swrast: use _mesa_get_current_tex_object() in swrastSetTexBuffer2()
Brian Paul [Mon, 27 Jan 2014 19:07:05 +0000 (12:07 -0700)]
swrast: use _mesa_get_current_tex_object() in swrastSetTexBuffer2()

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years ago st/mesa: use _mesa_get_current_tex_object() in st_context_teximage()
Brian Paul [Mon, 27 Jan 2014 19:06:39 +0000 (12:06 -0700)]
st/mesa: use _mesa_get_current_tex_object() in st_context_teximage()

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years ago mesa: use _mesa_get_current_tex_object() in GetTexLevelParameteriv()
Brian Paul [Mon, 27 Jan 2014 19:05:53 +0000 (12:05 -0700)]
mesa: use _mesa_get_current_tex_object() in GetTexLevelParameteriv()

And update a related comment.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years ago radeon: use _mesa_get_current_tex_object() in radeonSetTexBuffer2()
Brian Paul [Mon, 27 Jan 2014 19:05:19 +0000 (12:05 -0700)]
radeon: use _mesa_get_current_tex_object() in radeonSetTexBuffer2()

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years ago r200: use _mesa_get_current_tex_object() in r200SetTexBuffer2()
Brian Paul [Mon, 27 Jan 2014 19:04:01 +0000 (12:04 -0700)]
r200: use _mesa_get_current_tex_object() in r200SetTexBuffer2()

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years ago build: move ARCH_LIBS definition outside of ASM definition
Paul Seidler [Tue, 21 Jan 2014 21:44:37 +0000 (22:44 +0100)]
build: move ARCH_LIBS definition outside of ASM definition

_mesa_streaming_load_memcpy is needed even when assembly is disabled.

Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years ago dri: Add a useful error message if someone's packages missed libudev deps.
Eric Anholt [Thu, 30 Jan 2014 18:44:58 +0000 (10:44 -0800)]
dri: Add a useful error message if someone's packages missed libudev deps.

Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years ago dri: Also support the loader with libudev.so.0.
Eric Anholt [Thu, 30 Jan 2014 18:30:57 +0000 (10:30 -0800)]
dri: Also support the loader with libudev.so.0.

As far as I know, this should be safe.  If not, we have to decide whether
to have variable lookup of the functions, or just drop support for .so.0
(which is a year and a half old it looks like)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74127
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years ago freedreno: better manage our WFI's
Rob Clark [Sat, 1 Feb 2014 15:53:00 +0000 (10:53 -0500)]
freedreno: better manage our WFI's

Updates to non-banked registers, CP_LOAD_STATE, etc, need a WFI if there
is potentially pending rendering.  Track this better, and add fd_wfi()
calls everywhere that might potentially need CP_WAIT_FOR_IDLE.
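
The tracking described above could be sketched roughly as follows.  This
is only an illustration: struct ctx_sketch, mark_rendering_pending() and
wfi_sketch() are hypothetical names, not the driver's actual fd_wfi()
implementation, and a counter stands in for writing CP_WAIT_FOR_IDLE into
a ring buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, stripped-down context; the real driver context holds more. */
struct ctx_sketch {
   bool needs_wfi;       /* set while rendering may still be in flight */
   unsigned wfi_emitted; /* stands in for CP_WAIT_FOR_IDLE packets written */
};

/* Called after a draw is queued: later state updates must wait for idle. */
static void
mark_rendering_pending(struct ctx_sketch *ctx)
{
   assert(ctx != NULL);
   ctx->needs_wfi = true;
}

/* Emit a wait-for-idle only if rendering might be pending, then clear
 * the flag so back-to-back state updates do not emit redundant waits. */
static void
wfi_sketch(struct ctx_sketch *ctx)
{
   assert(ctx != NULL);
   if (!ctx->needs_wfi)
      return;
   ctx->wfi_emitted++;
   ctx->needs_wfi = false;
}
```

With this shape, callers can invoke wfi_sketch() before every update that
might need it, and only the first call after a draw costs anything.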

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years ago freedreno/a3xx: add logicop
Rob Clark [Wed, 15 Jan 2014 00:06:46 +0000 (19:06 -0500)]
freedreno/a3xx: add logicop

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years ago freedreno/a3xx: handle frag z write
Rob Clark [Tue, 14 Jan 2014 18:03:20 +0000 (13:03 -0500)]
freedreno/a3xx: handle frag z write

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years ago freedreno: resync generated headers
Rob Clark [Tue, 14 Jan 2014 12:35:23 +0000 (07:35 -0500)]
freedreno: resync generated headers

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years ago freedreno/a3xx: fix const confusion
Rob Clark [Tue, 14 Jan 2014 14:54:02 +0000 (09:54 -0500)]
freedreno/a3xx: fix const confusion

Gallium can leave const buffers bound above what is used by the current
shader, which can have a couple of bad effects:

1) writes beyond the assigned const space, which can trigger an HLSQ lockup
2) double emit of immed consts, first with the bound const buffer vals and
then with the actual immed vals.  This seems to be a sort of
undefined condition.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years ago freedreno/a3xx/compiler: compiler cleanups
Rob Clark [Sun, 10 Nov 2013 23:29:46 +0000 (18:29 -0500)]
freedreno/a3xx/compiler: compiler cleanups

Drop color/pos/psize_regid, plus a few compiler and IR cleanups.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years ago freedreno/compiler/a3xx: remove lowered instructions
Rob Clark [Mon, 13 Jan 2014 23:03:42 +0000 (18:03 -0500)]
freedreno/compiler/a3xx: remove lowered instructions

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years ago freedreno: add tgsi lowering pass
Rob Clark [Wed, 15 Jan 2014 13:08:18 +0000 (08:08 -0500)]
freedreno: add tgsi lowering pass

Currently lowers the following instructions:

   DST, XPD, SCS, LRP, FRC, POW, LIT, EXP, LOG, DP4,
   DP3, DPH, DP2

translating these into equivalent simpler TGSI instructions.
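
As an illustration of what "lowering" means here, DP3 can be expressed
with one multiply and two multiply-adds.  The sketch below works on plain
floats rather than TGSI tokens; op_mul(), op_mad() and dp3_lowered() are
hypothetical helpers, and the exact sequence the pass emits may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Each helper mirrors one simple scalar instruction. */
static float op_mul(float a, float b) { return a * b; }
static float op_mad(float a, float b, float c) { return a * b + c; }

/* DP3 rewritten as MUL + MAD + MAD:
 *   tmp = a.x * b.x
 *   tmp = a.y * b.y + tmp
 *   dst = a.z * b.z + tmp
 */
static float
dp3_lowered(const float a[3], const float b[3])
{
   assert(a != NULL && b != NULL);
   float tmp = op_mul(a[0], b[0]);
   tmp = op_mad(a[1], b[1], tmp);
   return op_mad(a[2], b[2], tmp);
}
```

A backend that implements MUL and MAD therefore gets DP3 without any
dedicated code, which is the point of doing the lowering up front.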

This probably should be moved to util so other drivers can use
it, but just adding under freedreno for now so that I can clear
out a lot of the lowering code in a3xx compiler before beginning
to add new compiler.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years ago freedreno/a3xx/compiler: add CLAMP
Rob Clark [Wed, 15 Jan 2014 13:07:27 +0000 (08:07 -0500)]
freedreno/a3xx/compiler: add CLAMP

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years ago freedreno/a3xx/compiler: various fixes
Rob Clark [Sun, 12 Jan 2014 18:55:05 +0000 (13:55 -0500)]
freedreno/a3xx/compiler: various fixes

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years ago freedreno: ctx should hold ref to dev
Rob Clark [Sat, 11 Jan 2014 15:34:36 +0000 (10:34 -0500)]
freedreno: ctx should hold ref to dev

The ctx should hold ref to dev to avoid problems if screen is destroyed
before ctx.  Doesn't really fix the egl/glx issues, but at least it
prevents things from getting much worse.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years ago freedreno: add prims-emitted driver query
Rob Clark [Tue, 14 Jan 2014 12:34:41 +0000 (07:34 -0500)]
freedreno: add prims-emitted driver query

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years ago i965: Silence unused variable 'ctx' warning.
Kenneth Graunke [Sat, 1 Feb 2014 05:40:05 +0000 (21:40 -0800)]
i965: Silence unused variable 'ctx' warning.

Somehow I missed this before pushing the Broadwell PS state upload code.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
10 years ago i965: Fix math instruction hstride assertions on Broadwell.
Kenneth Graunke [Fri, 31 Jan 2014 01:50:02 +0000 (17:50 -0800)]
i965: Fix math instruction hstride assertions on Broadwell.

In the final revision of my gen8_generator patch, I updated the MATH
instruction's assertion from (dst.hstride == 1) to check that source and
destination hstride matched.  Unfortunately, I didn't test this enough,
and many Piglit tests fail this assertion.

The documentation indicates that "scalar source is also supported",
which we believe means <0,1,0> access mode (hstride == 0).  If hstride
is non-zero, then it must match the destination register.
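
The rule in the paragraph above amounts to a small predicate.  The sketch
below is an assumption-laden illustration: struct reg_sketch and
math_src_hstride_ok() are hypothetical, and hstride is treated as a plain
element stride rather than the hardware's encoded value.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified register operand; not the i965 register struct. */
struct reg_sketch {
   unsigned hstride; /* horizontal stride in elements; 0 means scalar <0,1,0> */
};

/* A MATH source is acceptable if it is scalar (hstride == 0) or if its
 * horizontal stride matches the destination's. */
static bool
math_src_hstride_ok(struct reg_sketch dst, struct reg_sketch src)
{
   return src.hstride == 0 || src.hstride == dst.hstride;
}
```

The generator would then assert math_src_hstride_ok(dst, src) for each
source operand instead of requiring an exact stride match.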

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years ago i965: Add (disabled) Broadwell PCI IDs.
Kenneth Graunke [Fri, 1 Nov 2013 18:41:34 +0000 (11:41 -0700)]
i965: Add (disabled) Broadwell PCI IDs.

This puts the PCI IDs in place so it's easy to enable support.  However,
it doesn't actually enable support since it's very preliminary still,
and a few crucial pieces (such as BLORP) are still missing.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Eric Anholt <eric@anholt.net>
10 years ago i965: Disable 3DSTATE_WM_HZ_OP fields.
Kenneth Graunke [Fri, 6 Dec 2013 11:07:54 +0000 (03:07 -0800)]
i965: Disable 3DSTATE_WM_HZ_OP fields.

Eric believes this to be wrong and unnecessary, as the command is
supposed to emit an implicit rectangle primitive.  However, empirically
the pixel pipeline is completely unreliable without it.  So for now, it
stays until someone comes up with a better solution.

We'll need to do better than this when we implement multisampling, HiZ,
or fast clears...but for now, this will do.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Eric Anholt <eric@anholt.net>
10 years ago i965: Update GS state for Broadwell.
Kenneth Graunke [Tue, 5 Nov 2013 07:19:55 +0000 (23:19 -0800)]
i965: Update GS state for Broadwell.

This is quite similar to the Gen7 code.  The main changes:
 - 48-bit relocations
 - Thread count is specified as U/2-1 instead of U-1.
 - An extra DWord (DW9) with clip planes, URB entry output length/offsets
 - We need to program the "Expected Vertex Count" (VerticesIn)

v2: Set the number of binding table entries so they can be prefetched
    (requested by Eric Anholt).
v3: Add a WARN_ONCE for a missing workaround.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
10 years ago i965: Update multisampling state for Broadwell.
Kenneth Graunke [Mon, 3 Dec 2012 23:20:37 +0000 (15:20 -0800)]
i965: Update multisampling state for Broadwell.

On previous platforms, 3DSTATE_MULTISAMPLE contained the number of
samples, pixel location, and the positions of each sample within a pixel
for each multisampling mode (4x and 8x).  It was also a non-pipelined
command, presumably since changing the sample positions is fairly
drastic.

Broadwell improves upon this by splitting the sample positions out into
a separate non-pipelined state packet, 3DSTATE_SAMPLE_PATTERN.  With
that removed, 3DSTATE_MULTISAMPLE becomes a pipelined state packet.

Broadwell also supports 2x and 16x multisampling, in addition to the 4x
and 8x supported by Gen7.  This patch, however, does not implement 2x
and 16x.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
10 years ago i965: Update 3DSTATE_{DEPTH,STENCIL,...}_BUFFER and such for Broadwell.
Kenneth Graunke [Fri, 14 Dec 2012 11:58:30 +0000 (03:58 -0800)]
i965: Update 3DSTATE_{DEPTH,STENCIL,...}_BUFFER and such for Broadwell.

The amount of cut and paste from Gen7 is rather ugly, and should
probably be cleaned up in the future.  Even the Gen7 code is in need of
some tidying though; many of the function parameters aren't used on
platforms that use level/layer rather than tile offsets.  Tidying both
can be left to a future patch series.  This at least gets things going.

v2: Rebase on Paul's rename of NumLayers -> MaxNumLayers.

v3: Shift QPitch by 2 when storing it in the packet.  Bits 14:0 store
    bits 16:2 of the actual value.  Fixes tests.
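
The v3 note describes a simple bit-packing step, sketched below.  The
function name pack_qpitch_sketch() is hypothetical; the only facts taken
from the note are that the 15-bit field (bits 14:0) holds bits 16:2 of the
real QPitch value.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a QPitch value into a 15-bit field that stores bits 16:2 of it.
 * The low two bits are dropped, so the value must be a multiple of 4,
 * and it must fit in 17 bits overall. */
static uint32_t
pack_qpitch_sketch(uint32_t qpitch)
{
   assert((qpitch & 0x3) == 0);     /* bits 1:0 cannot be represented */
   assert(qpitch < (1u << 17));     /* bits above 16 cannot be represented */
   return (qpitch >> 2) & 0x7fff;   /* bits 16:2 land in field bits 14:0 */
}
```

Storing the unshifted value would program a pitch four times too large,
which is consistent with the test failures the note mentions.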

v4: Add missing stencil buffer QPitch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Eric Anholt <eric@anholt.net>
10 years ago i965: Update BLEND_STATE for Broadwell.
Kenneth Graunke [Thu, 6 Dec 2012 03:30:26 +0000 (19:30 -0800)]
i965: Update BLEND_STATE for Broadwell.

v2: Allow logic ops on all surface types.  The UNORM restriction was
    lifted with Haswell and I simply hadn't noticed.  Also, add missing
    BRW_NEW_STATE_BASE_ADDRESS dirty bit.  Both caught by Eric Anholt.

v3: Fix swapped per-RT DWord pairs.  Eliminates bizarre hacks.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
10 years ago i965: Update SF_CLIP_VIEWPORT for Broadwell.
Kenneth Graunke [Wed, 5 Dec 2012 23:34:34 +0000 (15:34 -0800)]
i965: Update SF_CLIP_VIEWPORT for Broadwell.

It has additional fields to support clipping to the viewport even if
guardband clipping is enabled.

v2: Update for viewport array changes.
v3: No, seriously, update for viewport array changes.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net> [v1]
10 years ago i965: Rework SURFACE_STATE entries for Broadwell.
Kenneth Graunke [Wed, 5 Dec 2012 00:39:03 +0000 (16:39 -0800)]
i965: Rework SURFACE_STATE entries for Broadwell.

v2: Add missing SCS setting in gen8_emit_buffer_surface_state (caught by
    Eric Anholt).

v3: Use stored QPitch rather than recomputing it.

v4: Shift QPitch by 2 when setting it in the packet; bits 14:0 store
    bits 16:2 of the actual value (fixes myriads of cube and array
    texturing tests).  Also, only enable cube face bits for cubemaps
    (matches Chris Forbes' commit on master).  Port to use offset64.

v5: s/gl_format/mesa_format/g

v6: Fix DW5 of renderbuffer state, which neglected to subtract
    irb->mt->first_level.  Use vertical_alignment() rather than
    hardcoding 4.  Use ffs for multisample counts rather than a
    large switch statement (all caught/suggested by Eric).
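
The ffs trick mentioned in v6 can be sketched as below.  It assumes the
hardware field holds log2 of the sample count, which the note does not
state explicitly; sample_count_field_sketch() is a hypothetical name.

```c
#include <assert.h>
#include <strings.h> /* ffs() (POSIX) */

/* For a power-of-two sample count, ffs() returns the 1-based index of the
 * single set bit, so ffs(n) - 1 equals log2(n).  This replaces a switch
 * with one case per supported count. */
static unsigned
sample_count_field_sketch(unsigned num_samples)
{
   assert(num_samples > 0);
   assert((num_samples & (num_samples - 1)) == 0); /* power of two only */
   return (unsigned)ffs((int)num_samples) - 1;
}
```

New sample counts then need no extra code, as long as they are powers of
two.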

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
10 years ago i965: Update SOL state for Broadwell.
Kenneth Graunke [Tue, 4 Dec 2012 22:45:19 +0000 (14:45 -0800)]
i965: Update SOL state for Broadwell.

Unlike on Gen7, we can directly set the offset via the state packet.
We also -have- to: the kernel SOL reset code won't work anymore.

v2: Fix copy and paste mistake in buffer stride setup; drop stale
    comment (caught by Eric Anholt).  Add a perf_debug for missing
    MOCS setup.

v3: Rebase on Paul Berry's changes to CurrentVertexProgram.

v4: Fix SO Write Offset handling.  We need to set bits 20 and 21 so the
    hardware both loads and saves the offset.  There's also a
    restriction that 3DSTATE_SO_BUFFER can only be programmed once per
    buffer between primitives, so the "reset to zero" code needed
    reworking.  Fixes most of the transform feedback Piglit tests.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net> [v2]
10 years ago i965: Update the code that disables unused shader stages for Broadwell.
Kenneth Graunke [Thu, 29 Nov 2012 05:39:19 +0000 (21:39 -0800)]
i965: Update the code that disables unused shader stages for Broadwell.

v2: Also disable 3DSTATE_WM_CHROMAKEY for safety.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net> [v1]
10 years ago i965: Update 3DSTATE_CLIP for Broadwell.
Kenneth Graunke [Fri, 1 Nov 2013 23:29:33 +0000 (16:29 -0700)]
i965: Update 3DSTATE_CLIP for Broadwell.

Broadwell's winding order, polygon fill, and viewport Z test fields have
moved to DWord 1 of 3DSTATE_RASTER.

v2: Add a perf_debug for a future optimization and improve commit
    message (both suggested by Eric Anholt).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
10 years ago i965: Rework vertex uploads for Broadwell.
Kenneth Graunke [Tue, 4 Dec 2012 02:28:29 +0000 (18:28 -0800)]
i965: Rework vertex uploads for Broadwell.

v2: Emit a dummy 3DSTATE_VF_SGVS packet when not needed.

v3: Add WARN_ONCE and perf_debugs requested by Eric Anholt.

v4: Program 3DSTATE_SGVS even in the no-elements case so gl_VertexID
    continues working.  Fix 3DSTATE_VF_INSTANCING to not use an
    element index to access the buffers array.  Some ARB_draw_indirect
    prep work.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
10 years ago i965: Update STATE_BASE_ADDRESS for Broadwell.
Kenneth Graunke [Mon, 3 Dec 2012 21:53:40 +0000 (13:53 -0800)]
i965: Update STATE_BASE_ADDRESS for Broadwell.

v2: Fix missing "change" bit on instruction state base address
    (caught by Haihao Xiang).

v3: Add a perf_debug for missing MOCS setup, requested by Eric.

v4: Fix buffer sizes.  The value, specified at bit 12 and up, is
    actually measured in 4k pages.  We need to round up to the
    next multiple of 4k.
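
The rounding described in v4 can be sketched as follows.  The helper names
are hypothetical; the point is only that a size field starting at bit 12
counts 4k pages, so a byte size has to be rounded up before it is placed
there.

```c
#include <assert.h>
#include <stdint.h>

/* Round a byte count up to the next multiple of 4096. */
static uint64_t
round_up_to_4k_sketch(uint64_t bytes)
{
   assert(bytes <= UINT64_MAX - 4095); /* avoid wrap-around */
   return (bytes + 4095) & ~(uint64_t)4095;
}

/* Number of 4k pages needed to cover the buffer.  This is the value that
 * ends up in bits 12 and up when the rounded byte size is written into
 * the dword. */
static uint64_t
size_in_4k_pages_sketch(uint64_t bytes)
{
   return round_up_to_4k_sketch(bytes) >> 12;
}
```

Without the round-up, a buffer of 4097 bytes would be programmed as a
single page and its tail would fall outside the declared size.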

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net> [v3]
Reviewed-by: Matt Turner <mattst88@gmail.com> [v4]
10 years ago i965: Update 3DSTATE_PS, 3DSTATE_WM, and add 3DSTATE_PS_EXTRA.
Kenneth Graunke [Fri, 30 Nov 2012 05:00:27 +0000 (21:00 -0800)]
i965: Update 3DSTATE_PS, 3DSTATE_WM, and add 3DSTATE_PS_EXTRA.

v2: Fix setting of GEN8_PSX_ATTRIBUTE_ENABLE after rebases.

v3: Add missing binding table entry counts.  Don't worry about alpha
    testing or alpha to coverage when setting the "Kill Pixel" bit;
    those are specified in 3DSTATE_PS_BLEND (caught by Eric Anholt).
    Drop unused _NEW_BUFFERS.  Tidy comments.

v4: Rebase on Paul Berry's changes to CurrentFragmentProgram.

v5: Re-enable line stippling.  It doesn't crash or anything.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net> [v3]
10 years ago i965: Rework 3DSTATE_VS for Broadwell.
Kenneth Graunke [Thu, 29 Nov 2012 09:10:19 +0000 (01:10 -0800)]
i965: Rework 3DSTATE_VS for Broadwell.

v2: Remove incorrect MOCS shifts; rename urb_entry_write_offset to
    urb_entry_output_offset to closer match the documentation.

v3: Only emit a non-zero constant buffer read length when active.

v4: Add missing binding table counts (caught by Eric).

v5: Rebase on Paul Berry's changes to CurrentVertexProgram.

v6: Drop bogus SBE read length/offset field code.  We were programming
    the wrong values, and our 3DSTATE_SBE code overrides any value we
    put here anyway with the correct one.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net> [v4]
10 years ago i965: Add the new 3DSTATE_PS_BLEND state packet.
Kenneth Graunke [Fri, 30 Nov 2012 02:43:59 +0000 (18:43 -0800)]
i965: Add the new 3DSTATE_PS_BLEND state packet.

v2: Only set GEN8_PS_BLEND_HAS_WRITEABLE_RT if color buffer writes are
    enabled (caught by Eric Anholt).

v3: Set non-blending flags (writeable RT, alpha test, alpha to coverage)
    for integer formats too.  +14 Piglits.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net> [v2]
10 years ago i965: Replace DEPTH_STENCIL_STATE with Gen8's 3DSTATE_WM_DEPTH_STENCIL.
Kenneth Graunke [Fri, 30 Nov 2012 01:52:31 +0000 (17:52 -0800)]
i965: Replace DEPTH_STENCIL_STATE with Gen8's 3DSTATE_WM_DEPTH_STENCIL.

v2: Use stencil->_WriteEnabled instead of setting
    GEN8_WM_DS_STENCIL_BUFFER_WRITE_ENABLE twice (suggested by Eric).

v3: Mask stencil->WriteMask and stencil->ValueMask with 0xff.  The field
    is only 8-bits, so we'd trip the new SET_FIELD assertion when core
    Mesa gave us a value like 0xFFFFFFFF.  The Gen7 code uses structure
    field widths to implicitly do this truncation.  Fixes Piglit tests.
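
The interaction between the mask and the assertion can be sketched like
this.  The shift and mask values and the function names are hypothetical
and chosen only for illustration; set_field_sketch() imitates the
behaviour the note attributes to SET_FIELD, namely asserting that the
value fits in the field.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field layout; not the real packet definition. */
#define SKETCH_WRITE_MASK_SHIFT 8
#define SKETCH_WRITE_MASK_MASK  (0xffu << SKETCH_WRITE_MASK_SHIFT)

/* Shift a value into a field, asserting that no bits fall outside it. */
static uint32_t
set_field_sketch(uint32_t value, unsigned shift, uint32_t mask)
{
   uint32_t shifted = value << shift;
   assert((shifted & ~mask) == 0); /* trips if value is wider than the field */
   return shifted & mask;
}

/* Truncate the 32-bit API-level mask to 8 bits before packing, so a value
 * such as 0xFFFFFFFF does not trip the assertion above. */
static uint32_t
pack_stencil_write_mask_sketch(uint32_t write_mask)
{
   return set_field_sketch(write_mask & 0xff,
                           SKETCH_WRITE_MASK_SHIFT,
                           SKETCH_WRITE_MASK_MASK);
}
```

Code that packs fields through bitfield struct members gets this
truncation for free, which is why an explicit mask only becomes necessary
with shift-and-or packing.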

v4: Use uint32_t for dw1/dw2, not uint8_t.  Worst. Typo. Ever.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net> [v2]
10 years ago i965: Update SF, SBE, and RASTER state for Broadwell.
Kenneth Graunke [Fri, 1 Nov 2013 21:37:33 +0000 (14:37 -0700)]
i965: Update SF, SBE, and RASTER state for Broadwell.

The attribute override portion of 3DSTATE_SBE was split out into
3DSTATE_SBE_SWIZ; various bits of 3DSTATE_SF were split out into
3DSTATE_RASTER.

v2: Set Force URB Read Offset bit.  Eventually the URB read offset
    should be set in 3DSTATE_VS, but that will require some refactoring.

v3: Rebase on viewport array changes.

v4: Improve comments about URB read length/offset overrides.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
10 years ago i965: Bump generation assertions on workaround flushes.
Kenneth Graunke [Thu, 29 Nov 2012 09:50:22 +0000 (01:50 -0800)]
i965: Bump generation assertions on workaround flushes.

I haven't investigated whether these are necessary on Broadwell or not,
but for paranoia's sake, we may as well continue doing them for now.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
10 years ago i965: Duplicate gen7_atoms to gen8_atoms.
Kenneth Graunke [Thu, 29 Nov 2012 05:16:18 +0000 (21:16 -0800)]
i965: Duplicate gen7_atoms to gen8_atoms.

It's going to diverge significantly.  Starting out with a copy allows
future patches to change atoms one by one.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
10 years ago radeon: move driContextSetFlags(ctx) call after ctx var is initialized
Brian Paul [Sat, 1 Feb 2014 00:09:44 +0000 (17:09 -0700)]
radeon: move driContextSetFlags(ctx) call after ctx var is initialized

CC: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>