mesa.git
8 years ago st/vdpau: use linear layout for output surfaces
Christian König [Thu, 14 Jan 2016 12:40:25 +0000 (13:40 +0100)]
st/vdpau: use linear layout for output surfaces

Works around a bug in radeonsi, and tiling is not very
beneficial in this use case anyway.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
8 years ago radeonsi: ignore PIPE_BIND_LINEAR in si_is_format_supported v2
Christian König [Thu, 14 Jan 2016 12:38:10 +0000 (13:38 +0100)]
radeonsi: ignore PIPE_BIND_LINEAR in si_is_format_supported v2

Linear layout should work for all formats that are not compressed or depth/stencil.

v2: restrict it a bit more

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
8 years ago st/mesa: enable OES_texture_buffer when all components available
Ilia Mirkin [Tue, 29 Mar 2016 00:59:13 +0000 (20:59 -0400)]
st/mesa: enable OES_texture_buffer when all components available

OES_texture_buffer combines bits from a number of desktop extensions.
When they're all available, turn it on.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
8 years ago glapi/glx: Mark the indirect swapped dispatch functions _X_COLD
Adam Jackson [Thu, 24 Mar 2016 17:57:58 +0000 (13:57 -0400)]
glapi/glx: Mark the indirect swapped dispatch functions _X_COLD

A modest size savings:

   text    data     bss     dec     hex filename
 264143   15608     232  279983   445af libglx.so.before
 254303   15608     232  270143   41f3f libglx.so.after

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
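
For scale, the table above amounts to 264143 - 254303 = 9840 bytes of text, about 3.7% of the original size. A minimal sketch of what marking a function cold looks like, assuming GCC or Clang. MY_COLD and both function names are hypothetical stand-ins; the real _X_COLD macro comes from the X11 headers.

```c
#include <stdint.h>

/* Hypothetical stand-in for _X_COLD: hints that the function is rarely
 * executed, so the compiler may optimize it for size and keep it away
 * from hot code. Expands to nothing on other compilers. */
#if defined(__GNUC__)
#define MY_COLD __attribute__((cold))
#else
#define MY_COLD
#endif

/* Byte-swapped path: only reached for clients of the opposite endianness. */
MY_COLD uint32_t dispatch_swapped(uint32_t request)
{
    return ((request & 0x000000ffu) << 24) |
           ((request & 0x0000ff00u) << 8)  |
           ((request & 0x00ff0000u) >> 8)  |
           ((request & 0xff000000u) >> 24);
}

/* Native-endian path, left as ordinary (hot) code. */
uint32_t dispatch_native(uint32_t request)
{
    return request;
}
```

The attribute does not change behavior, only code placement and optimization choices, which is why the saving shows up in the text column alone.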
8 years ago glapi/glx: Sync some additional error checking from xserver
Adam Jackson [Thu, 24 Mar 2016 17:57:58 +0000 (13:57 -0400)]
glapi/glx: Sync some additional error checking from xserver

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
8 years ago glsl: raise warning when using uninitialized variables
Alejandro Piñeiro [Tue, 23 Feb 2016 10:48:52 +0000 (11:48 +0100)]
glsl: raise warning when using uninitialized variables

v2:
 * Take into account out varyings too (Timothy Arceri)
 * Fix style (Timothy Arceri)
 * Use a new ast_expression variable, instead of an
   ast_expression::hir new parameter (Timothy Arceri)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94129

Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
8 years ago glsl: add is_lhs bool on ast_expression
Alejandro Piñeiro [Thu, 25 Feb 2016 10:11:54 +0000 (11:11 +0100)]
glsl: add is_lhs bool on ast_expression

Useful to know whether an expression is the recipient of an assignment
or not. That can be used to (for example) raise "use of uninitialized
variable" warnings without getting a false positive when a variable
is first assigned.

By default the value is false, and it is set to true in
the following cases:
 * On the lhs subexpression of an assignment.
 * At ast_array_index, on the array itself.
 * While handling a method call on an array, to avoid the warning
   when calling array.length
 * When computing the cached test expression at test_to_hir, to
   avoid a duplicate warning on the test expression of a switch.

A set_is_lhs setter is added, because in some cases (like ast_field_selection)
the value needs to be propagated down the expression tree. To avoid doing the
propagation when it is not needed, it is skipped if no
primary_expression.identifier is available.

v2: use a new bool on ast_expression, instead of a new parameter
    on ast_expression::hir (Timothy Arceri)

v3: fix style and some typos on comments, initialize is_lhs default value
    on constructor, to avoid a c++11 feature (Ian Romanick)

v4: some tweaks on comments (Timothy Arceri)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94129

Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
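
The idea behind is_lhs can be sketched on a toy expression tree. This is only an illustration under simplifying assumptions: the node kinds, struct layout and check_uninitialized() are hypothetical and far simpler than the real ast_expression code, which is C++.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified AST; not the real ast_expression. */
enum expr_kind { EXPR_IDENTIFIER, EXPR_ASSIGN, EXPR_ADD };

struct variable {
    bool assigned;
};

struct expr {
    enum expr_kind kind;
    struct variable *var;   /* only used by EXPR_IDENTIFIER */
    struct expr *lhs, *rhs; /* only used by binary nodes */
    bool is_lhs;            /* false by default */
};

/* Walk the tree and return the number of "use of uninitialized variable"
 * warnings that would be raised. */
int check_uninitialized(struct expr *e)
{
    int warnings = 0;

    switch (e->kind) {
    case EXPR_IDENTIFIER:
        /* Reading a never-assigned variable warns; an assignment
         * target (is_lhs) does not. */
        if (!e->is_lhs && !e->var->assigned)
            warnings++;
        break;
    case EXPR_ASSIGN:
        /* Check the right-hand side first, then mark the target. */
        warnings += check_uninitialized(e->rhs);
        e->lhs->is_lhs = true;
        warnings += check_uninitialized(e->lhs);
        if (e->lhs->kind == EXPR_IDENTIFIER)
            e->lhs->var->assigned = true;
        break;
    case EXPR_ADD:
        warnings += check_uninitialized(e->lhs);
        warnings += check_uninitialized(e->rhs);
        break;
    }
    return warnings;
}
```

For "a = b" with neither variable assigned, this reports one warning (for b) and none for a, which is the false positive the flag is meant to avoid.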
8 years ago nir: Add a helper for getting the current block from a cursor
Jason Ekstrand [Fri, 25 Mar 2016 21:16:47 +0000 (14:16 -0700)]
nir: Add a helper for getting the current block from a cursor

Reviewed-by: Rob Clark <robdclark@gmail.com>
8 years ago nir/lower_out_to_temp: Add an "entrypoint" parameter
Jason Ekstrand [Fri, 25 Mar 2016 21:11:19 +0000 (14:11 -0700)]
nir/lower_out_to_temp: Add an "entrypoint" parameter

Previously, the pass assumed that the entrypoint would be whatever function
happened to have the name "main".  We really shouldn't trust in the
function names.

Reviewed-by: Rob Clark <robdclark@gmail.com>
8 years ago nir/lower_out_to_temp: Steal the output's constant initializer
Jason Ekstrand [Fri, 25 Mar 2016 21:17:18 +0000 (14:17 -0700)]
nir/lower_out_to_temp: Steal the output's constant initializer

Reviewed-by: Rob Clark <robdclark@gmail.com>
8 years ago nir: Add a helper for getting the unique function in a shader
Jason Ekstrand [Fri, 25 Mar 2016 21:07:41 +0000 (14:07 -0700)]
nir: Add a helper for getting the unique function in a shader

Reviewed-by: Rob Clark <robdclark@gmail.com>
8 years ago nir/sweep: Sweep function parameters
Jason Ekstrand [Fri, 25 Mar 2016 18:10:30 +0000 (11:10 -0700)]
nir/sweep: Sweep function parameters

They are no longer in the list of local variables so we need to explicitly
sweep them.

Reviewed-by: Rob Clark <robdclark@gmail.com>
8 years ago nir/builder: Add a helper for creating undefs
Jason Ekstrand [Fri, 25 Mar 2016 17:43:46 +0000 (10:43 -0700)]
nir/builder: Add a helper for creating undefs

Reviewed-by: Rob Clark <robdclark@gmail.com>
8 years ago nir/builder: Add a helper for storing to variable derefs
Jason Ekstrand [Fri, 25 Mar 2016 17:35:03 +0000 (10:35 -0700)]
nir/builder: Add a helper for storing to variable derefs

Reviewed-by: Rob Clark <robdclark@gmail.com>
8 years ago nir/builder: Add a helper for building fdot instructions
Jason Ekstrand [Fri, 25 Mar 2016 17:34:17 +0000 (10:34 -0700)]
nir/builder: Add a helper for building fdot instructions

Reviewed-by: Rob Clark <robdclark@gmail.com>
8 years ago nir: Add a variable_foreach_safe helper
Jason Ekstrand [Fri, 25 Mar 2016 17:18:35 +0000 (10:18 -0700)]
nir: Add a variable_foreach_safe helper

Reviewed-by: Rob Clark <robdclark@gmail.com>
8 years ago nir/Makefile: Fix alphabetization
Jason Ekstrand [Fri, 25 Mar 2016 17:08:50 +0000 (10:08 -0700)]
nir/Makefile: Fix alphabetization

Reviewed-by: Rob Clark <robdclark@gmail.com>
8 years ago mesa: add OES_texture_buffer and EXT_texture_buffer support
Ilia Mirkin [Sat, 27 Feb 2016 21:16:28 +0000 (16:16 -0500)]
mesa: add OES_texture_buffer and EXT_texture_buffer support

Allow ES 3.1 contexts to access the texture buffer functionality.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
8 years ago glsl: add OES_texture_buffer and EXT_texture_buffer support
Ilia Mirkin [Sat, 27 Feb 2016 21:13:50 +0000 (16:13 -0500)]
glsl: add OES_texture_buffer and EXT_texture_buffer support

Expose the samplerBuffer/imageBuffer types, and allow the various
functions to operate on them.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
8 years ago mesa: add OES_texture_buffer and EXT_texture_buffer extension to table
Ilia Mirkin [Sat, 27 Feb 2016 21:06:42 +0000 (16:06 -0500)]
mesa: add OES_texture_buffer and EXT_texture_buffer extension to table

We need to add a new bit since the GL ES exts require functionality from
a combination of texture buffer extensions as well as images (for
imageBuffer) support. Additionally, not all GPUs support all the texture
buffer functionality (e.g. rgb32 isn't supported by nv50).

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
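
The reason for a separate bit can be sketched as a conjunction of capabilities. The struct and field names below are hypothetical, and the exact set of required features is an assumption drawn from the message above (texture buffer functionality including rgb32 formats, plus image support for imageBuffer), not the actual extension table entry.

```c
#include <stdbool.h>

/* Hypothetical capability flags; names are illustrative only. */
struct caps {
    bool texture_buffer_object;
    bool texture_buffer_range;
    bool texture_buffer_rgb32; /* e.g. not available on nv50 */
    bool shader_images;        /* needed for imageBuffer */
};

/* The ES extension is exposed only when every component is present. */
bool has_oes_texture_buffer(const struct caps *c)
{
    return c->texture_buffer_object &&
           c->texture_buffer_range &&
           c->texture_buffer_rgb32 &&
           c->shader_images;
}
```

A driver lacking any single component (such as rgb32 buffer formats) keeps the desktop extensions it does support while the combined ES extension stays off.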
8 years ago mesa: properly return GetTexLevelParameter queries for buffer textures
Ilia Mirkin [Sat, 27 Feb 2016 21:04:51 +0000 (16:04 -0500)]
mesa: properly return GetTexLevelParameter queries for buffer textures

This fixes all failures with dEQP tests in this area. While
ARB_texture_buffer_object explicitly says that GetTexLevelParameter & co
should not be supported, GL 3.1 reverses this decision and allows all of
these queries there.

Conversely, there is no text that forbids the buffer-specific queries
from being used with non-buffer images.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Brian Paul <brianp@vmware.com>
8 years ago glsl: Delete initialized field from uniform storage test.
Kenneth Graunke [Mon, 28 Mar 2016 23:57:19 +0000 (16:57 -0700)]
glsl: Delete initialized field from uniform storage test.

Timothy deleted this field.  Fixes "make check".

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
8 years ago mesa: remove initialized field from uniform storage
Timothy Arceri [Sun, 27 Mar 2016 03:51:02 +0000 (14:51 +1100)]
mesa: remove initialized field from uniform storage

The only place this was used was in a gallium debug function that
had to be manually enabled.

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
8 years ago nvc0: use a different offset for buffers and surfaces
Samuel Pitoiset [Mon, 28 Mar 2016 10:43:01 +0000 (12:43 +0200)]
nvc0: use a different offset for buffers and surfaces

To avoid overwriting buffer and surface information, we need to use
a different offset for each in the driver constant buffer. Currently,
OP_SUQ is only supported for buffers, but this will be updated for
image support.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years ago i965: Set address rounding bits for GL_NEAREST filtering as well.
Kenneth Graunke [Tue, 8 Mar 2016 07:54:53 +0000 (23:54 -0800)]
i965: Set address rounding bits for GL_NEAREST filtering as well.

Yuanhan Liu decided these were useful for linear filtering in
commit 76669381 (circa 2011).  Prior to that, we never set them;
it seems he tried to preserve that behavior for nearest filtering.

It turns out they're useful for nearest filtering, too: setting
these fixes the following dEQP-GLES3 tests:

functional.fbo.blit.rect.nearest_consistency_mag
functional.fbo.blit.rect.nearest_consistency_mag_reverse_src_x
functional.fbo.blit.rect.nearest_consistency_mag_reverse_src_y
functional.fbo.blit.rect.nearest_consistency_mag_reverse_dst_x
functional.fbo.blit.rect.nearest_consistency_mag_reverse_dst_y
functional.fbo.blit.rect.nearest_consistency_mag_reverse_src_dst_x
functional.fbo.blit.rect.nearest_consistency_mag_reverse_src_dst_y
functional.fbo.blit.rect.nearest_consistency_min
functional.fbo.blit.rect.nearest_consistency_min_reverse_src_x
functional.fbo.blit.rect.nearest_consistency_min_reverse_src_y
functional.fbo.blit.rect.nearest_consistency_min_reverse_dst_x
functional.fbo.blit.rect.nearest_consistency_min_reverse_dst_y
functional.fbo.blit.rect.nearest_consistency_min_reverse_src_dst_x
functional.fbo.blit.rect.nearest_consistency_min_reverse_src_dst_y
functional.fbo.blit.rect.nearest_consistency_out_of_bounds_mag
functional.fbo.blit.rect.nearest_consistency_out_of_bounds_mag_reverse_src_x
functional.fbo.blit.rect.nearest_consistency_out_of_bounds_mag_reverse_src_y
functional.fbo.blit.rect.nearest_consistency_out_of_bounds_mag_reverse_dst_x
functional.fbo.blit.rect.nearest_consistency_out_of_bounds_mag_reverse_dst_y
functional.fbo.blit.rect.nearest_consistency_out_of_bounds_mag_reverse_src_dst_x
functional.fbo.blit.rect.nearest_consistency_out_of_bounds_mag_reverse_src_dst_y
functional.fbo.blit.rect.nearest_consistency_out_of_bounds_min
functional.fbo.blit.rect.nearest_consistency_out_of_bounds_min_reverse_src_x
functional.fbo.blit.rect.nearest_consistency_out_of_bounds_min_reverse_src_y
functional.fbo.blit.rect.nearest_consistency_out_of_bounds_min_reverse_dst_x
functional.fbo.blit.rect.nearest_consistency_out_of_bounds_min_reverse_dst_y
functional.fbo.blit.rect.nearest_consistency_out_of_bounds_min_reverse_src_dst_x
functional.fbo.blit.rect.nearest_consistency_out_of_bounds_min_reverse_src_dst_y

Apparently, BLORP has always set these bits unconditionally.

However, setting them unconditionally appears to regress tests using
texture projection, 3D samplers, integer formats, and vertex shaders,
all in combination, such as:

functional.shaders.texture_functions.textureprojlod.isampler3d_vertex

Setting them on Gen4-5 appears to regress Piglit's
tests/spec/arb_sampler_objects/framebufferblit.

Honestly, it looks like the real problem here is a lack of precision.
I'm just hacking around problems here (as embarrassing as it is).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
8 years ago i965: Always use BRW_TEXCOORDMODE_CUBE when seamless filtering.
Kenneth Graunke [Wed, 23 Mar 2016 18:56:39 +0000 (11:56 -0700)]
i965: Always use BRW_TEXCOORDMODE_CUBE when seamless filtering.

When using seamless cube map mode and NEAREST filtering, we explicitly
overrode the wrap modes to CLAMP_TO_EDGE.  This was to implement the
following spec text:

   "If NEAREST filtering is done within a miplevel, always apply
    wrap mode CLAMP_TO_EDGE."

However, textureGather() ignores the sampler's filtering mode, and
instead returns the four pixels that would be blended by LINEAR
filtering.  This implies that we should do proper seamless filtering,
and include pixels from adjacent cube faces.

It turns out that we can simply delete the NEAREST -> CLAMP_TO_EDGE
overrides.  Normal cube map sampling works by first selecting the
face, and then nearest filtering fetches the closest texel.  If the
nearest texel was on a different face, then that face would have been
chosen.  So it should always be within the face anyway, which
effectively performs CLAMP_TO_EDGE.

Fixes 86 dEQP-GLES31.texture.gather.basic.cube.* tests.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Suggested-by: Ian Romanick <idr@freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
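
The argument about nearest filtering staying within the selected face can be sketched as follows. Assumptions: the face order (+X, -X, +Y, -Y, +Z, -Z) and the sc/tc mapping follow the OpenGL cube map convention, the direction vector is non-zero, and cube_nearest() and struct cube_texel are hypothetical names rather than driver code.

```c
/* Result of a nearest lookup: face index plus integer texel coordinates. */
struct cube_texel { int face, x, y; };

static float absf(float v) { return v < 0.0f ? -v : v; }

/* Select the face from the major axis of (rx, ry, rz), then pick the
 * nearest texel on that face. The direction must be non-zero. */
struct cube_texel cube_nearest(float rx, float ry, float rz, int size)
{
    float ax = absf(rx), ay = absf(ry), az = absf(rz);
    float sc, tc, ma;
    struct cube_texel t;

    if (ax >= ay && ax >= az) {
        ma = ax;
        t.face = rx >= 0.0f ? 0 : 1;
        sc = rx >= 0.0f ? -rz : rz;
        tc = -ry;
    } else if (ay >= az) {
        ma = ay;
        t.face = ry >= 0.0f ? 2 : 3;
        sc = rx;
        tc = ry >= 0.0f ? rz : -rz;
    } else {
        ma = az;
        t.face = rz >= 0.0f ? 4 : 5;
        sc = rz >= 0.0f ? rx : -rx;
        tc = -ry;
    }

    /* |sc| <= ma and |tc| <= ma, so both coordinates land in [0, 1]. */
    t.x = (int)((0.5f * (sc / ma + 1.0f)) * (float)size);
    t.y = (int)((0.5f * (tc / ma + 1.0f)) * (float)size);

    /* Only a coordinate of exactly 1.0 can reach index 'size'. */
    if (t.x > size - 1) t.x = size - 1;
    if (t.y > size - 1) t.y = size - 1;
    return t;
}
```

Because the major axis bounds the other two components, the face-local coordinates never leave [0, 1], so the chosen texel is always on the selected face without any explicit wrap-mode override.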
8 years ago i965: Fix brw_render_cache_set_check_flush's PIPE_CONTROLs.
Kenneth Graunke [Fri, 25 Mar 2016 22:33:35 +0000 (15:33 -0700)]
i965: Fix brw_render_cache_set_check_flush's PIPE_CONTROLs.

Our driver uses the brw_render_cache mechanism to track buffers we've
rendered to and are about to sample from.

Previously, we did a single PIPE_CONTROL with the following bits set:
- Render Target Flush
- Depth Cache Flush
- Texture Cache Invalidate
- VF Cache Invalidate
- Instruction Cache Invalidate
- CS Stall

This combined both "top of pipe" invalidations and "bottom of pipe"
flushes, which isn't how the hardware is intended to be programmed.

The "top of pipe" invalidations may happen right away, without any
guarantees that rendering using those caches has completed.  That
rendering may continue altering the caches.  The "bottom of pipe"
flushes do wait for the rendering to complete.  The CS stall also
prevents further work from happening until data is flushed out.

What we wanted to do was wait for rendering to complete, flush the new
data out of the render and depth caches, wait, then invalidate any
stale data in read-only caches.  We can accomplish this by doing the
"bottom of pipe" flushes with a CS stall, then the "top of pipe"
invalidations as a second PIPE_CONTROL.  The flushes will wait until the
rendering is complete, and the CS stall will prevent the second
PIPE_CONTROL with the invalidations from executing until the first
is done.

Fixes dEQP-GLES3.functional.texture.specification.teximage2d_pbo
subtests on Braswell and Skylake.  These tests hit the meta PBO
texture upload path, which binds the PBO as a texture and samples
from it, while rendering to the destination texture.  The tests
then sample from the texture.

For now, we leave Gen4-5 alone.  It probably needs work too, but
apparently it hasn't even been setting the (G45+) TC invalidation
bit at all...

v2: Add Sandybridge post-sync non-zero workaround, for safety.

Cc: mesa-stable@lists.freedesktop.org
Suggested-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
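
The ordering described above can be sketched as two separate commands. Everything here is hypothetical: the flag values, struct batch and emit_pipe_control() are stand-ins, not the real PIPE_CONTROL encoding or the i965 batch API, and placing the instruction cache invalidate in the second command is an assumption based on it being an invalidation.

```c
#include <stdint.h>

/* Hypothetical flag bits, for illustration only. */
enum {
    PC_RENDER_TARGET_FLUSH          = 1u << 0,
    PC_DEPTH_CACHE_FLUSH            = 1u << 1,
    PC_TEXTURE_CACHE_INVALIDATE     = 1u << 2,
    PC_VF_CACHE_INVALIDATE          = 1u << 3,
    PC_INSTRUCTION_CACHE_INVALIDATE = 1u << 4,
    PC_CS_STALL                     = 1u << 5
};

#define MAX_CMDS 8

/* Toy command buffer that just records the flags of each command. */
struct batch {
    uint32_t cmds[MAX_CMDS];
    int count;
};

static void emit_pipe_control(struct batch *b, uint32_t flags)
{
    if (b->count < MAX_CMDS)
        b->cmds[b->count++] = flags;
}

/* First flush written data and stall ("bottom of pipe"), then invalidate
 * the read-only caches ("top of pipe") in a second command. */
void flush_then_invalidate(struct batch *b)
{
    emit_pipe_control(b, PC_RENDER_TARGET_FLUSH |
                         PC_DEPTH_CACHE_FLUSH |
                         PC_CS_STALL);
    emit_pipe_control(b, PC_TEXTURE_CACHE_INVALIDATE |
                         PC_VF_CACHE_INVALIDATE |
                         PC_INSTRUCTION_CACHE_INVALIDATE);
}
```

The point of the split is that no single command mixes flush and invalidate bits, so the invalidations cannot run before the flushes have finished.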
8 years ago i965: Whack UAV bit when FS discards and there are no color writes.
Kenneth Graunke [Thu, 24 Mar 2016 23:21:35 +0000 (16:21 -0700)]
i965: Whack UAV bit when FS discards and there are no color writes.

dEQP-GLES31.functional.fbo.no_attachments.* draws a quad with no
framebuffer attachments, using a shader that discards based on
gl_FragCoord.  It uses occlusion queries to inspect whether pixels
are rendered or not.

Unfortunately, the hardware is not dispatching any pixel shaders,
so discards never happen, and the full quad of pixels increments
PS_DEPTH_COUNT, making the occlusion query results bogus.

To understand why, we have to delve into the WM_INT internal
signalling mechanism's formulas.

The "WM_INT::Pixel Shader Kill Pixel" signal is defined as:

    3DSTATE_WM::ForceKillPixel == ON ||
    (3DSTATE_WM::ForceKillPixel != Off &&
     !WM_INT::WM_HZ_OP &&
     3DSTATE_WM::EDSC_Mode != PREPS &&
     (WM_INT::Depth Write Enable || WM_INT::Stencil Write Enable) &&
     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     (3DSTATE_PS_EXTRA::PixelShaderKillsPixels ||
      3DSTATE_PS_EXTRA:: oMask Present to RenderTarget ||
      3DSTATE_PS_BLEND::AlphaToCoverageEnable ||
      3DSTATE_PS_BLEND::AlphaTestEnable ||
      3DSTATE_WM_CHROMAKEY::ChromaKeyKillEnable))

Because there is no depth or stencil buffer, writes to those buffers
are disabled.  So the highlighted condition is false, making the whole
"Kill Pixel" condition false.  This then feeds into the following
"WM_INT::ThreadDispatchEnable" condition:

    3DSTATE_WM::ForceThreadDispatch != OFF &&
    !WM_INT::WM_HZ_OP &&
    3DSTATE_PS_EXTRA::PixelShaderValid &&
    (3DSTATE_PS_EXTRA::PixelShaderHasUAV ||
     WM_INT::Pixel Shader Kill Pixel ||
     WM_INT::RTIndependentRasterizationEnable ||
     (!3DSTATE_PS_EXTRA::PixelShaderDoesNotWriteRT &&
      3DSTATE_PS_BLEND::HasWriteableRT) ||
     (WM_INT::Pixel Shader Computed Depth Mode != PSCDEPTH_OFF &&
      (WM_INT::Depth Test Enable || WM_INT::Depth Write Enable)) ||
     (3DSTATE_PS_EXTRA::Computed Stencil && WM_INT::Stencil Test Enable) ||
     (3DSTATE_WM::EDSC_Mode == 1 && (WM_INT::Depth Test Enable ||
                                     WM_INT::Depth Write Enable ||
                                     WM_INT::Stencil Test Enable)))

Given that there's no depth/stencil testing, no writeable render target,
and the hardware thinks kill pixel doesn't happen, all of these
conditions are false.  We have to whack some bit to make PS invocations
happen.  There are many options.

Curro suggested using the UAV bit.  There's some precedent in doing
that - we set it for fragment shaders that do SSBO/image/atomic writes
when no color buffer writes are enabled.  We can simply include discard
here too.

Fixes 64 dEQP-GLES31.functional.fbo.no_attachments.* tests.

v2: Add a comment suggested and written by Jason Ekstrand.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
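
The two formulas above can be evaluated on a reduced model to see why no threads are dispatched. This sketch keeps only the terms relevant to the no-attachments case and assumes the rest are in their neutral state (force modes normal, no HZ op, valid shader, EDSC off, no computed depth or stencil, no RT-independent rasterization, no other kill sources). The struct and function names are hypothetical.

```c
#include <stdbool.h>

/* Reduced state: only the terms that matter for the case described above. */
struct wm_state {
    bool depth_write;      /* WM_INT::Depth Write Enable */
    bool stencil_write;    /* WM_INT::Stencil Write Enable */
    bool ps_kills_pixels;  /* 3DSTATE_PS_EXTRA::PixelShaderKillsPixels */
    bool ps_has_uav;       /* 3DSTATE_PS_EXTRA::PixelShaderHasUAV */
    bool ps_writes_rt;     /* !PixelShaderDoesNotWriteRT */
    bool has_writeable_rt; /* 3DSTATE_PS_BLEND::HasWriteableRT */
};

/* Simplified "Pixel Shader Kill Pixel": discard only counts when depth
 * or stencil writes are enabled. */
static bool kill_pixel(const struct wm_state *s)
{
    return (s->depth_write || s->stencil_write) && s->ps_kills_pixels;
}

/* Simplified "ThreadDispatchEnable". */
bool thread_dispatch_enable(const struct wm_state *s)
{
    return s->ps_has_uav ||
           kill_pixel(s) ||
           (s->ps_writes_rt && s->has_writeable_rt);
}
```

With no attachments and a discarding shader, every term is false until the UAV bit is set, which matches the workaround chosen here.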
8 years ago vc4: Remove unused include from vc4_nir_lower_txf_ms.c
Rhys Kidd [Sat, 19 Mar 2016 22:37:57 +0000 (18:37 -0400)]
vc4: Remove unused include from vc4_nir_lower_txf_ms.c

Found with grep and inspection. Test compiled on RPi hw.
Assists any future effort to remove TGSI as an intermediate stage.

Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
8 years ago glapi/glx: Treat xserver generated targets as .PHONY
Adam Jackson [Thu, 24 Mar 2016 17:57:57 +0000 (13:57 -0400)]
glapi/glx: Treat xserver generated targets as .PHONY

Meaning, always rebuild them when asked instead of bothering to look at
timestamps (and then wondering why nothing happened when you said make).

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
8 years ago glapi/glx: Thunk non-ABI calls through GetProcAddress
Adam Jackson [Thu, 24 Mar 2016 17:57:57 +0000 (13:57 -0400)]
glapi/glx: Thunk non-ABI calls through GetProcAddress

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
8 years ago glapi/glx: Emit direct GL calls instead of dispatch lookup
Adam Jackson [Thu, 24 Mar 2016 17:57:57 +0000 (13:57 -0400)]
glapi/glx: Emit direct GL calls instead of dispatch lookup

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
8 years ago glx: Unbreak generating some of the xorg glx headers
Adam Jackson [Thu, 24 Mar 2016 17:57:57 +0000 (13:57 -0400)]
glx: Unbreak generating some of the xorg glx headers

Broken by:

    commit 9ace0b542241c77ae82a0835ac8a09e2a7510eaf
    Author: Dylan Baker <baker.dylan.c@gmail.com>
    Date:   Wed May 20 15:49:11 2015 -0700

        glapi: glX_proto_size.py: use argparse instead of getopt

Which changed most, but not all, callers to use --header-tag instead of
-h.

Reviewed-by: Dylan Baker <baker.dylan.c@gmail.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
8 years ago mesa/st: Fix NULL access if no fragment shader is bound
Bas Nieuwenhuizen [Mon, 28 Mar 2016 15:01:49 +0000 (17:01 +0200)]
mesa/st: Fix NULL access if no fragment shader is bound

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
8 years ago freedreno/ir3: fix for load_front_face intrinsic
Rob Clark [Mon, 21 Mar 2016 23:55:37 +0000 (19:55 -0400)]
freedreno/ir3: fix for load_front_face intrinsic

Seems like trying to widen in the same instruction as the add.s does a
non-sign-extending widen.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
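
The difference between the two kinds of widening can be sketched in plain C. This assumes a 16-bit source widened to 32 bits purely for illustration; the actual register widths involved are not stated above, and both function names are hypothetical.

```c
#include <stdint.h>

/* Non-sign-extending widen: the upper bits are filled with zeros. */
uint32_t widen_zero_extend(uint16_t half)
{
    return (uint32_t)half;
}

/* Sign-extending widen: the upper bits replicate the sign bit. */
uint32_t widen_sign_extend(uint16_t half)
{
    return (half & 0x8000u) ? (0xffff0000u | half) : (uint32_t)half;
}
```

A 16-bit -1 (0xffff) becomes 0x0000ffff with the first function and 0xffffffff with the second, so a consumer expecting an all-ones value sees something else after a non-sign-extending widen.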
8 years ago freedreno/ir3: fix compiler warn
Rob Clark [Sat, 6 Feb 2016 14:09:52 +0000 (09:09 -0500)]
freedreno/ir3: fix compiler warn

Signed-off-by: Rob Clark <robclark@freedesktop.org>
8 years ago nvc0: make sure to disable fetches from previously-set VBOs when blitting
Ilia Mirkin [Mon, 28 Mar 2016 04:52:00 +0000 (00:52 -0400)]
nvc0: make sure to disable fetches from previously-set VBOs when blitting

We disable the vertex attributes, but also disable the VBO fetch details
as well, just in case. Not known to fix anything.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years ago nvc0: disable primitive restart and index bias during blits
Ilia Mirkin [Sun, 27 Mar 2016 02:32:43 +0000 (22:32 -0400)]
nvc0: disable primitive restart and index bias during blits

Back in the dawn of time, we used to do immediate uploads for the vertex
data, and all was well. However Maxwell dropped support for immediate
vertex data, so we started feeding in a VBO (in all cases). But we
forgot to disable some things that apply in such cases, specifically
primitive restart and index bias. The latter was causing WoW and other
Blizzard games trouble as they use a pattern where they draw with a base
vertex (aka index bias), followed by texture uploads (aka blits,
internally).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91526
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Karol Herbst <nouveau@karolherbst.de>
8 years ago nvc0/ir: fix picking of coordinates from tex instruction for textureGrad
Ilia Mirkin [Sun, 20 Mar 2016 21:26:13 +0000 (17:26 -0400)]
nvc0/ir: fix picking of coordinates from tex instruction for textureGrad

On Fermi, there's an argument in front of the coords that combines array
and indirect handle, while on Kepler the array and the indirect handle
are separate (and in front of the coords). We were previously only
accounting for the array bit of it; if there was an indirect access, it
wouldn't be counted in the formula.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
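
The argument counting described above can be sketched as a small helper. The function name and its boolean parameters are hypothetical; only the rule itself (one combined argument on Fermi, separate arguments on Kepler) comes from the message.

```c
#include <stdbool.h>

/* Number of arguments that precede the coordinates in a tex instruction,
 * i.e. the index of the first coordinate. */
int first_coord_index(bool kepler_or_later, bool is_array, bool is_indirect)
{
    if (kepler_or_later) {
        /* Array index and indirect handle are separate arguments. */
        return (is_array ? 1 : 0) + (is_indirect ? 1 : 0);
    }
    /* Fermi: a single argument combines array index and indirect handle. */
    return (is_array || is_indirect) ? 1 : 0;
}
```

Counting only the array case returns 0 for a non-array indirect access, which makes the first argument be read as a coordinate.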
8 years ago nv50/ir: saturate depth writes
Ilia Mirkin [Sun, 20 Mar 2016 17:11:01 +0000 (13:11 -0400)]
nv50/ir: saturate depth writes

Apparently there's no post-FS clamping logic, so we have to do this by
hand. The depth will never be outside of the 0..1 range, even on
floating point zeta buffers, so this should be safe.

Fixes dEQP-GLES3.functional.fbo.depth.*clamp.* which tests writing
invalid values on various zeta buffer formats.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
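
Saturating by hand amounts to clamping to the 0..1 range. A minimal sketch with a hypothetical function name; mapping NaN to 0 is a choice made in this sketch, not something stated above.

```c
/* Clamp a depth value written by the fragment shader to [0, 1].
 * The first test is written so that NaN also maps to 0. */
float saturate_depth(float z)
{
    if (!(z > 0.0f))
        return 0.0f;
    if (z > 1.0f)
        return 1.0f;
    return z;
}
```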
8 years ago gallium/util: fix up inaccurate behavior of util_framebuffer_state_equal (v2)
Marek Olšák [Sun, 27 Mar 2016 17:11:09 +0000 (19:11 +0200)]
gallium/util: fix up inaccurate behavior of util_framebuffer_state_equal (v2)

v2: move the nr_cbufs check above the loop

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> (v1)
8 years ago st/mesa: only minify height if target != 1D array in st_finalize_texture
Marek Olšák [Mon, 21 Mar 2016 11:18:40 +0000 (12:18 +0100)]
st/mesa: only minify height if target != 1D array in st_finalize_texture

The st_texture_object documentation says:
  "the number of 1D array layers will be in height0"

We can't minify that.

Spotted by luck. No app is known to hit this issue.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
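
The rule can be sketched as follows. The names minify_dim(), struct tex_size and level_size() are hypothetical and this is not the st_finalize_texture code; it only illustrates that the layer count stored in height0 must pass through unchanged.

```c
#include <stdbool.h>

struct tex_size { unsigned width, height; };

/* Halve a dimension 'levels' times, never going below 1.
 * 'levels' must be smaller than the bit width of unsigned. */
static unsigned minify_dim(unsigned value, unsigned levels)
{
    unsigned v = value >> levels;
    return v > 0 ? v : 1;
}

/* Size of a mip level. For 1D arrays height0 holds the number of
 * layers, which does not shrink with the mip level. */
struct tex_size level_size(unsigned width0, unsigned height0,
                           unsigned level, bool is_1d_array)
{
    struct tex_size s;

    s.width = minify_dim(width0, level);
    s.height = is_1d_array ? height0 : minify_dim(height0, level);
    return s;
}
```

For a 64-texel-wide 1D array with 6 layers, level 2 is 16 wide and still has 6 layers; minifying the height would wrongly report 1.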
8 years ago mesa: optimize out the realloc from glCopyTexImagexD()
Miklós Máté [Thu, 24 Mar 2016 00:13:02 +0000 (01:13 +0100)]
mesa: optimize out the realloc from glCopyTexImagexD()

v2: comment about the purpose of the code
v3: also compare texFormat,
 add a perf debug message,
 formatting fixes

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Miklós Máté <mtmkls@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
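
The optimization can be sketched as a check made before reallocating. This is an assumption-laden illustration: struct tex_image, prepare_copy_target(), the allocation counter and the fixed 4 bytes per texel are all hypothetical, and only the idea of comparing size and format (v3 above) comes from the message.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical texture image with a counter to observe allocations. */
struct tex_image {
    int width, height, format;
    unsigned char *data;
    int allocs;
};

/* Make 'img' ready to receive a w x h copy in format 'fmt'. Storage is
 * reused when size and format already match. Returns false on
 * allocation failure. */
bool prepare_copy_target(struct tex_image *img, int w, int h, int fmt)
{
    if (img->data && img->width == w && img->height == h &&
        img->format == fmt)
        return true; /* same size and format: keep the existing storage */

    free(img->data);
    img->data = malloc((size_t)w * (size_t)h * 4); /* assumes 4 bytes/texel */
    if (!img->data) {
        img->width = img->height = 0;
        return false;
    }
    img->width = w;
    img->height = h;
    img->format = fmt;
    img->allocs++;
    return true;
}
```

Repeated copies into an image of unchanged size and format then cost no allocation at all.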
8 years ago st/mesa: fix handling the fallback texture
Miklós Máté [Thu, 24 Mar 2016 00:13:00 +0000 (01:13 +0100)]
st/mesa: fix handling the fallback texture

This fixes a crash when post-processing is enabled in SW:KotOR.

v2: fix const-ness
v3: move assignment into the if() block

Signed-off-by: Miklós Máté <mtmkls@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
8 years ago st/mesa: enable GL_ATI_fragment_shader
Miklós Máté [Thu, 24 Mar 2016 00:12:58 +0000 (01:12 +0100)]
st/mesa: enable GL_ATI_fragment_shader

Signed-off-by: Miklós Máté <mtmkls@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
8 years ago st/mesa: implement GL_ATI_fragment_shader
Miklós Máté [Thu, 24 Mar 2016 00:12:57 +0000 (01:12 +0100)]
st/mesa: implement GL_ATI_fragment_shader

v2: fix arithmetic for special opcodes,
 fix fog state, cleanup
v3: simplify handling of special opcodes,
 fix rebinding with different textargets or fog equation,
 lots of formatting fixes
v4: adapt to the "compile early, fix later" architecture,
 formatting fixes

Signed-off-by: Miklós Máté <mtmkls@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
8 years ago program: add ATI_fragment_shader to shader stages list
Miklós Máté [Thu, 24 Mar 2016 00:12:56 +0000 (01:12 +0100)]
program: add ATI_fragment_shader to shader stages list

Signed-off-by: Miklós Máté <mtmkls@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
8 years ago mesa: optionally associate a gl_program to ATI_fragment_shader
Miklós Máté [Thu, 24 Mar 2016 00:12:55 +0000 (01:12 +0100)]
mesa: optionally associate a gl_program to ATI_fragment_shader

The state tracker will use it.

Acked-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Miklós Máté <mtmkls@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
8 years ago gallium/p_context.h: Make comment more readable
Edward O'Callaghan [Sun, 27 Mar 2016 02:05:34 +0000 (13:05 +1100)]
gallium/p_context.h: Make comment more readable

Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
8 years ago mesa/st: Remove GLSLVersion clamping
Edward O'Callaghan [Sat, 26 Mar 2016 07:35:07 +0000 (18:35 +1100)]
mesa/st: Remove GLSLVersion clamping

While here, remove the intermediate glsl_feature_level variable.

Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
8 years ago radeon/r600: Fix return type in failure branch
Edward O'Callaghan [Sat, 26 Mar 2016 07:35:06 +0000 (18:35 +1100)]
radeon/r600: Fix return type in failure branch

Commit `d4e847ea` introduced a warning about making an
integer from a pointer without a cast, fix it here.

Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
8 years ago radeon/r600_query.c: Minor style fix
Edward O'Callaghan [Sat, 26 Mar 2016 07:35:05 +0000 (18:35 +1100)]
radeon/r600_query.c: Minor style fix

Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
8 years ago virgl: drop next shader property for now.
Dave Airlie [Wed, 23 Mar 2016 23:28:49 +0000 (09:28 +1000)]
virgl: drop next shader property for now.

Signed-off-by: Dave Airlie <airlied@redhat.com>
8 years ago glsl: reduce buffer block duplication
Timothy Arceri [Thu, 24 Mar 2016 01:11:01 +0000 (12:11 +1100)]
glsl: reduce buffer block duplication

This reduces some of the craziness required for handling buffer
blocks. The problem is that each shader stage holds its own information
about a block in memory. We were copying that information to a
program-wide list, but the per-stage information remained, meaning
that when a binding was updated we needed to update all versions of it.

This changes the per stage blocks to instead point to a single
version of the block information in the program list.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
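
The change in ownership can be sketched with a toy layout. All names and array sizes here are hypothetical and far simpler than the real linker data structures; the sketch only shows per-stage entries pointing at a single program-wide copy.

```c
#define MAX_STAGES 2
#define MAX_BLOCKS 4

/* One description of a buffer block. */
struct buffer_block {
    const char *name;
    unsigned binding;
};

/* A stage no longer owns block data; it points into the program list. */
struct stage {
    struct buffer_block *blocks[MAX_BLOCKS];
    unsigned num_blocks;
};

struct program {
    struct buffer_block blocks[MAX_BLOCKS]; /* the single copy */
    unsigned num_blocks;
    struct stage stages[MAX_STAGES];
};

/* Updating the program-wide entry is enough: every stage sees it. */
void set_block_binding(struct program *p, unsigned idx, unsigned binding)
{
    if (idx < p->num_blocks)
        p->blocks[idx].binding = binding;
}
```

With per-stage copies, the same update would have needed a loop over all stages to keep the duplicates in sync.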
8 years ago st/xa: emit sampler view declarations in shaders
Brian Paul [Fri, 25 Mar 2016 20:06:39 +0000 (14:06 -0600)]
st/xa: emit sampler view declarations in shaders

Fixes recent regressions with the VMware gallium driver.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Tested-by: Charmaine Lee <charmainel@vmware.com>
8 years ago swr: [rasterizer jitter] Fix MASKLOADD AVX prototype (float -> i32)
Tim Rowley [Thu, 24 Mar 2016 17:52:51 +0000 (11:52 -0600)]
swr: [rasterizer jitter] Fix MASKLOADD AVX prototype (float -> i32)

8 years ago swr: [rasterizer core] NUMA optimizations...
Tim Rowley [Thu, 24 Mar 2016 06:01:23 +0000 (00:01 -0600)]
swr: [rasterizer core] NUMA optimizations...

- Affinitize hot-tile memory to specific NUMA nodes.
- Only do BE work for macrotiles associated with the NUMA node

8 years ago swr: [rasterizer jitter] Fix logic bug for alpha-to-coverage.
Tim Rowley [Thu, 24 Mar 2016 00:12:11 +0000 (18:12 -0600)]
swr: [rasterizer jitter] Fix logic bug for alpha-to-coverage.

8 years ago swr: [rasterizer core] Fix Compute workitem retirement
Tim Rowley [Tue, 22 Mar 2016 23:28:06 +0000 (17:28 -0600)]
swr: [rasterizer core] Fix Compute workitem retirement

8 years ago swr: [rasterizer core] Cleanup state ring arena after last draw that references it...
Tim Rowley [Tue, 22 Mar 2016 21:13:29 +0000 (15:13 -0600)]
swr: [rasterizer core] Cleanup state ring arena after last draw that references it completes

Rather than waiting for the API thread to re-use it.

8 years ago swr: [rasterizer jitter] add missing include for llvm jitevents
Tim Rowley [Tue, 22 Mar 2016 18:41:13 +0000 (12:41 -0600)]
swr: [rasterizer jitter] add missing include for llvm jitevents

8 years ago swr: [rasterizer core] Reduce Arena blocksize to 128KB (from 1MB).
Tim Rowley [Tue, 22 Mar 2016 15:27:18 +0000 (09:27 -0600)]
swr: [rasterizer core] Reduce Arena blocksize to 128KB (from 1MB).

With global allocator this doesn't seem to affect performance at all.
Overall memory consumption drops by up to 85%.

8 years ago swr: [rasterizer core] One last pass at Arena optimizations
Tim Rowley [Mon, 21 Mar 2016 23:55:46 +0000 (17:55 -0600)]
swr: [rasterizer core] One last pass at Arena optimizations

8 years agoswr: [rasterizer core] CachedArena optimizations
Tim Rowley [Mon, 21 Mar 2016 23:30:03 +0000 (17:30 -0600)]
swr: [rasterizer core] CachedArena optimizations

Reduce list traversal during Alloc and Free.

Add ability to have multiple lists based on alloc size (not used for now)

8 years agoswr: [rasterizer jitter] support llvm-svn
Tim Rowley [Mon, 21 Mar 2016 20:08:38 +0000 (14:08 -0600)]
swr: [rasterizer jitter] support llvm-svn

8 years agoswr: [rasterizer core] Globally cache allocated arena blocks for fast re-allocation.
Tim Rowley [Mon, 21 Mar 2016 17:15:32 +0000 (11:15 -0600)]
swr: [rasterizer core] Globally cache allocated arena blocks for fast re-allocation.

8 years agoswr: [rasterizer] more arena work
Tim Rowley [Fri, 18 Mar 2016 18:11:20 +0000 (12:11 -0600)]
swr: [rasterizer] more arena work

8 years agoswr: [rasterizer core] Add clipping against user clip distances in the NullPS backend.
Tim Rowley [Fri, 18 Mar 2016 17:48:47 +0000 (11:48 -0600)]
swr: [rasterizer core] Add clipping against user clip distances in the NullPS backend.

8 years agoswr: [rasterizer core] Arena optimizations - preparing for global allocator.
Tim Rowley [Fri, 18 Mar 2016 00:10:25 +0000 (18:10 -0600)]
swr: [rasterizer core] Arena optimizations - preparing for global allocator.

8 years agoswr: [rasterizer core] Reset DrawContext arena at end of draw rather than upon reclai...
Tim Rowley [Thu, 17 Mar 2016 22:50:46 +0000 (16:50 -0600)]
swr: [rasterizer core] Reset DrawContext arena at end of draw rather than upon reclaim of DC

Keeps overall memory consumption lower.
Also, remove unused knobs.

8 years agoswr: [rasterizer core] Add clipping of user clip planes in clipper.
Tim Rowley [Thu, 17 Mar 2016 22:12:17 +0000 (16:12 -0600)]
swr: [rasterizer core] Add clipping of user clip planes in clipper.

8 years agoswr: [rasterizer] Reduce max in-flight draws to 96 (by default)
Tim Rowley [Thu, 17 Mar 2016 21:39:13 +0000 (15:39 -0600)]
swr: [rasterizer] Reduce max in-flight draws to 96 (by default)

8 years agoswr: [rasterizer] Fix run-time check asserts
Tim Rowley [Thu, 17 Mar 2016 18:22:43 +0000 (12:22 -0600)]
swr: [rasterizer] Fix run-time check asserts

One innocuous (uninitialized variable), and one not so innocuous
(stack corruption).

8 years agoswr: [rasterizer jitter] signed immediate builder
Tim Rowley [Wed, 16 Mar 2016 23:54:04 +0000 (17:54 -0600)]
swr: [rasterizer jitter] signed immediate builder

8 years agoswr: [rasterizer common] changes for cygwin
Tim Rowley [Wed, 16 Mar 2016 17:56:50 +0000 (11:56 -0600)]
swr: [rasterizer common] changes for cygwin

8 years agoswr: [rasterizer] code styling and update copyrights
Tim Rowley [Mon, 14 Mar 2016 21:54:29 +0000 (15:54 -0600)]
swr: [rasterizer] code styling and update copyrights

8 years agoswr: [rasterizer core] Guard against enqueuing work to invalid hot tiles
Tim Rowley [Fri, 11 Mar 2016 01:20:07 +0000 (19:20 -0600)]
swr: [rasterizer core] Guard against enqueuing work to invalid hot tiles

8 years agoswr: [rasterizer] Stop setting viewport size to larger than hottile array
Tim Rowley [Fri, 11 Mar 2016 01:19:30 +0000 (19:19 -0600)]
swr: [rasterizer] Stop setting viewport size to larger than hottile array

Guard against enqueuing work to invalid tiles.

8 years agoswr: [rasterizer] Discard work + misc fixes
Tim Rowley [Fri, 11 Mar 2016 00:30:40 +0000 (18:30 -0600)]
swr: [rasterizer] Discard work + misc fixes

8 years agoswr: [rasterizer] remove use of BYTE type
Tim Rowley [Thu, 10 Mar 2016 21:15:40 +0000 (15:15 -0600)]
swr: [rasterizer] remove use of BYTE type

8 years agoswr: [rasterizer core] Fix crash that can occur when switching contexts
Tim Rowley [Wed, 9 Mar 2016 23:18:55 +0000 (17:18 -0600)]
swr: [rasterizer core] Fix crash that can occur when switching contexts

8 years agoswr: [rasterizer] remove unused knob
Tim Rowley [Wed, 9 Mar 2016 22:33:33 +0000 (16:33 -0600)]
swr: [rasterizer] remove unused knob

8 years agoswr: [rasterizer core] subcontext rework
Tim Rowley [Wed, 9 Mar 2016 22:15:37 +0000 (16:15 -0600)]
swr: [rasterizer core] subcontext rework

8 years agoswr: [rasterizer common] add _simd_s[rl]lv_epi32
Tim Rowley [Wed, 9 Mar 2016 00:58:54 +0000 (18:58 -0600)]
swr: [rasterizer common] add _simd_s[rl]lv_epi32

8 years agoswr: [rasterizer core] Alleviate potential stack overflow for 32bit builds
Tim Rowley [Tue, 8 Mar 2016 17:56:06 +0000 (11:56 -0600)]
swr: [rasterizer core] Alleviate potential stack overflow for 32bit builds

Move large stack allocations in the GS and clipper into thread local storage.
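
As a rough illustration of the technique named above (not the actual SWR change), a large scratch buffer can be moved off the stack by giving each thread its own statically allocated copy. The names `ClipScratch`, `GetClipScratch`, and `SumWithScratch`, and the buffer size, are hypothetical and chosen only for this sketch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical scratch area; the size is arbitrary for this sketch.
struct ClipScratch {
    std::array<float, 16 * 1024> verts;
};

// Each thread gets its own instance, so no locking is needed and the
// buffer no longer counts against the (small, on 32-bit) thread stack.
inline ClipScratch& GetClipScratch() {
    static thread_local ClipScratch scratch{};
    return scratch;
}

// Example consumer: copies input into the scratch area and sums it,
// where a stack-based version would have declared a large local array.
inline float SumWithScratch(const float* in, std::size_t count) {
    ClipScratch& s = GetClipScratch();
    assert(count <= s.verts.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < count; ++i) {
        s.verts[i] = in[i];
        sum += s.verts[i];
    }
    return sum;
}
```

The trade-off is that the memory stays reserved for the lifetime of each thread instead of being released when the function returns.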

8 years agoswr: [rasterizer] remove use of UCHAR and UINT64 types
Tim Rowley [Mon, 7 Mar 2016 20:45:17 +0000 (14:45 -0600)]
swr: [rasterizer] remove use of UCHAR and UINT64 types

8 years agoswr: [rasterizer] remove use of FLOAT type
Tim Rowley [Mon, 7 Mar 2016 16:51:56 +0000 (10:51 -0600)]
swr: [rasterizer] remove use of FLOAT type

8 years agoswr: [rasterizer] Fix Coverity issues reported by Mesa developers.
Tim Rowley [Mon, 7 Mar 2016 07:14:13 +0000 (01:14 -0600)]
swr: [rasterizer] Fix Coverity issues reported by Mesa developers.

8 years agoswr: [rasterizer] add debug/perf category to knobs
Tim Rowley [Sat, 5 Mar 2016 06:53:04 +0000 (00:53 -0600)]
swr: [rasterizer] add debug/perf category to knobs

8 years agoswr: [rasterizer core] don't assume linux is 64-bit
Tim Rowley [Thu, 24 Mar 2016 16:07:32 +0000 (11:07 -0500)]
swr: [rasterizer core] don't assume linux is 64-bit

8 years agoswr: [rasterizer common] remove old unused win32 types
Tim Rowley [Thu, 24 Mar 2016 16:07:15 +0000 (11:07 -0500)]
swr: [rasterizer common] remove old unused win32 types

8 years agoswr: [rasterizer jitter] vpermps support
Tim Rowley [Fri, 4 Mar 2016 00:19:45 +0000 (18:19 -0600)]
swr: [rasterizer jitter] vpermps support

8 years agoswr: [rasterizer] Add rdtsc buckets support for shaders
Tim Rowley [Mon, 29 Feb 2016 18:01:48 +0000 (12:01 -0600)]
swr: [rasterizer] Add rdtsc buckets support for shaders

Pass pointer to core buckets mgr back to sim layer.

Add support for RDTSC_START/RDTSC_STOP macros in the builder.

Each unique shader now has a unique bucket associated with it,
enabling more detailed reporting at the shader level. Currently,
due to an LLVM issue with thread-local storage, 64-bit runs require
single-threaded mode.

8 years agoswr: [rasterizer core] backend reorganization
Tim Rowley [Wed, 24 Feb 2016 19:34:50 +0000 (13:34 -0600)]
swr: [rasterizer core] backend reorganization

8 years agoswr: [rasterizer core] store blend output in temporary instead of PS output.
Tim Rowley [Thu, 25 Feb 2016 01:03:33 +0000 (19:03 -0600)]
swr: [rasterizer core] store blend output in temporary instead of PS output.

Fixes additive blend problem with MSAA

8 years agoswr: [rasterizer core] Move InitializeHotTiles and corresponding clear code out of...
Tim Rowley [Tue, 23 Feb 2016 23:29:59 +0000 (17:29 -0600)]
swr: [rasterizer core] Move InitializeHotTiles and corresponding clear code out of threads.cpp.

8 years agoswr: [rasterizer jitter] Cleanup use of types inside of Builder.
Tim Rowley [Tue, 23 Feb 2016 19:47:24 +0000 (13:47 -0600)]
swr: [rasterizer jitter] Cleanup use of types inside of Builder.

Also, cache the SIMD width so that we don't have to keep querying
the JitManager for it.

8 years agoswr: [rasterizer jitter] Fix type mismatch on select args for SCATTERPS
Tim Rowley [Mon, 22 Feb 2016 17:00:07 +0000 (11:00 -0600)]
swr: [rasterizer jitter] Fix type mismatch on select args for SCATTERPS

8 years agoswr: [rasterizer core] fix rasterizing multisampling with scissor enabled
Tim Rowley [Sat, 20 Feb 2016 01:05:14 +0000 (19:05 -0600)]
swr: [rasterizer core] fix rasterizing multisampling with scissor enabled

We were not evaluating the scissor edge equations at sample positions.

8 years agoswr: [rasterizer core] RingBuffer class for DC/DS
Tim Rowley [Fri, 19 Feb 2016 23:55:23 +0000 (17:55 -0600)]
swr: [rasterizer core] RingBuffer class for DC/DS

Use head/tail ring buffer indices for thread synchronization.

1. SwrWaitForIdle loops until ring is empty. (head == tail)
2. GetDrawContext waits while the ring is full, i.e. while (head - tail) == ring size.
3. Draw enqueues by incrementing head.
4. Last worker thread to move past a DC dequeues by incrementing tail.

Todo: To reduce contention we can cache the tail in the API thread. For
example, if you know you have 64 free entries in the ring then you don't
need to keep checking the tail until you have used those 64 entries.
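
The four rules above can be sketched as a minimal head/tail ring. This is an illustration under stated assumptions, not the RingBuffer class from the commit: the class name `RingSketch`, its method names, the use of `std::atomic` counters, and the yield-based waiting are all hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Hypothetical sketch of a head/tail ring; not the SWR implementation.
template <typename T, uint32_t N>
class RingSketch {
    static_assert(N != 0 && (N & (N - 1)) == 0, "N must be a power of two");

public:
    // Rule 1: the ring is idle when head == tail.
    bool IsEmpty() const { return head_.load() == tail_.load(); }

    // Rule 2: the ring is full when (head - tail) == ring size.
    bool IsFull() const { return head_.load() - tail_.load() == N; }

    // API thread: wait for a free slot, then hand out the entry at head.
    T& GetNext() {
        while (IsFull()) std::this_thread::yield();
        return entries_[head_.load() % N];
    }

    // Rule 3: a draw is enqueued by incrementing head.
    void Enqueue() { head_.fetch_add(1); }

    // Rule 4: the last worker past an entry dequeues by incrementing tail.
    void Dequeue() {
        assert(!IsEmpty());
        tail_.fetch_add(1);
    }

    // Spin until every enqueued entry has been dequeued.
    void WaitForIdle() const {
        while (!IsEmpty()) std::this_thread::yield();
    }

private:
    T entries_[N]{};
    std::atomic<uint64_t> head_{0};  // advanced only by the API thread
    std::atomic<uint64_t> tail_{0};  // advanced by worker threads
};
```

Because the counters only ever increase, fullness is a simple subtraction, and the slot index is the counter modulo the ring size. The tail caching suggested in the Todo would amount to the API thread remembering how many free slots it last observed and calling `IsFull()` only once that budget is spent.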