mesa.git
8 years ago glsl: Add support for representing framebuffer fetch in the GLSL IR.
Francisco Jerez [Wed, 20 Jul 2016 03:07:47 +0000 (20:07 -0700)]
glsl: Add support for representing framebuffer fetch in the GLSL IR.

The GLSL IR representation of framebuffer fetch amounts to a single
bit in the ir_variable object applicable to fragment shader outputs.
The flag indicates that the variable will be implicitly initialized to
the previous contents of the render buffer at the same fragment
coordinates and sample index.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
8 years ago glsl: Add parser state enables for the framebuffer fetch extensions.
Francisco Jerez [Tue, 26 Jul 2016 00:24:52 +0000 (17:24 -0700)]
glsl: Add parser state enables for the framebuffer fetch extensions.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
8 years ago mesa: Add blend barrier entry point and driver hook.
Francisco Jerez [Wed, 6 Jul 2016 06:21:21 +0000 (23:21 -0700)]
mesa: Add blend barrier entry point and driver hook.

Both MESA_shader_framebuffer_fetch_non_coherent and the non-coherent
variant of KHR_blend_equation_advanced will use this driver hook to
request coherency between framebuffer reads and writes.  This
intentionally doesn't hook up glBlendBarrierMESA to the dispatch layer
since the extension isn't exposed to applications yet, see [1]
for more details.

[1] https://lists.freedesktop.org/archives/mesa-dev/2016-July/124028.html

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
8 years ago mesa: Move shader memory barrier functions into barrier.c.
Francisco Jerez [Wed, 6 Jul 2016 06:18:18 +0000 (23:18 -0700)]
mesa: Move shader memory barrier functions into barrier.c.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
8 years ago mesa: Rename "texturebarrier" source files to "barrier".
Francisco Jerez [Wed, 6 Jul 2016 06:15:01 +0000 (23:15 -0700)]
mesa: Rename "texturebarrier" source files to "barrier".

In preparation for collecting all pipeline barrier GL entry points
into a single source file.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
8 years ago mesa: Add support for querying GL_FRAGMENT_SHADER_DISCARDS_SAMPLES_EXT.
Francisco Jerez [Wed, 6 Jul 2016 04:28:11 +0000 (21:28 -0700)]
mesa: Add support for querying GL_FRAGMENT_SHADER_DISCARDS_SAMPLES_EXT.

This can currently only give true as result since the only way you can
expose EXT_shader_framebuffer_fetch right now is by flipping the
MESA_shader_framebuffer_fetch bit, but that could potentially change
in the future, see [1] for an explanation.

[1] https://lists.freedesktop.org/archives/mesa-dev/2016-July/124028.html

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
8 years ago mesa: Add extension enables for framebuffer fetch extensions.
Francisco Jerez [Wed, 20 Jul 2016 00:40:05 +0000 (17:40 -0700)]
mesa: Add extension enables for framebuffer fetch extensions.

This allows drivers to expose EXT_shader_framebuffer_fetch in GLES2+
contexts if desired.  Note that this adds boolean flags for two MESA
extensions, but only the EXT GLES-only extension is exposed for the
moment, see the cover letter of this series [1] for the rationale.

[1] https://lists.freedesktop.org/archives/mesa-dev/2016-July/124028.html

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
8 years ago glapi: Add XML for GL_EXT_shader_framebuffer_fetch.
Francisco Jerez [Wed, 6 Jul 2016 04:25:56 +0000 (21:25 -0700)]
glapi: Add XML for GL_EXT_shader_framebuffer_fetch.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
8 years ago nvc0: invalidate textures/samplers on GK104+
Samuel Pitoiset [Wed, 24 Aug 2016 18:22:52 +0000 (20:22 +0200)]
nvc0: invalidate textures/samplers on GK104+

Like Fermi, textures and samplers are aliased between 3D and compute,
especially the TIC_FLUSH/TSC_FLUSH methods and we have to re-validate
these resources when switching between the two pipelines.

This fixes a GPU hang with Elemental (and most likely with other UE4 demos).

Tested on GK107 and GM107.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
CC: <mesa-stable@lists.freedesktop.org>
8 years ago gallium/ttn: Remove duplicated TGSI_OPCODE_DP2A initialization
Rhys Kidd [Wed, 24 Aug 2016 04:13:04 +0000 (00:13 -0400)]
gallium/ttn: Remove duplicated TGSI_OPCODE_DP2A initialization

Duplicate line is currently on 1535.

Identified by Clang, when run through Eric Anholt's Travis harness.

Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
8 years ago travis: Upgrade LLVM dependency to 3.5 and enable LLVM drivers.
Eric Anholt [Thu, 18 Aug 2016 21:10:57 +0000 (14:10 -0700)]
travis: Upgrade LLVM dependency to 3.5 and enable LLVM drivers.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Rhys Kidd <rhyskidd@gmail.com>
8 years ago travis: Enable vc4 in libdrm to satisfy vc4 test build dependency.
Eric Anholt [Thu, 18 Aug 2016 20:43:12 +0000 (13:43 -0700)]
travis: Enable vc4 in libdrm to satisfy vc4 test build dependency.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Rhys Kidd <rhyskidd@gmail.com>
8 years ago travis: Update to the Ubuntu Trusty image.
Eric Anholt [Thu, 18 Aug 2016 20:12:18 +0000 (13:12 -0700)]
travis: Update to the Ubuntu Trusty image.

This will hopefully fix wget from x.org (no real reason explained in
Travis CI bug reports), and may also mean that we can enable LLVM driver
builds.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Rhys Kidd <rhyskidd@gmail.com>
8 years ago travis: Parse configure.ac to pick an updated LIBDRM_VERSION.
Eric Anholt [Thu, 18 Aug 2016 19:29:31 +0000 (12:29 -0700)]
travis: Parse configure.ac to pick an updated LIBDRM_VERSION.

Travis has been broken a couple of times by configure.ac updates.  To make
it useful, auto-update the version necessary.

This could potentially be used for other dependencies, too, but those get
bumped less frequently.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Rhys Kidd <rhyskidd@gmail.com>
8 years ago anv: meta_blit2d: adapt texel fetch pitch for fake w-tiled
Lionel Landwerlin [Wed, 24 Aug 2016 16:52:12 +0000 (17:52 +0100)]
anv: meta_blit2d: adapt texel fetch pitch for fake w-tiled

We need to compute detiling coordinates using the physical size of W tiling
(128x32) rather than the logical size (64x64).

v2: Correct comment (Jason)

Fixes dEQP-VK.api.copy_and_blit.image_to_image_stencil

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97448
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
8 years ago vc4: Fix GPU hangs with >16 varying values.
Eric Anholt [Mon, 22 Aug 2016 21:58:28 +0000 (14:58 -0700)]
vc4: Fix GPU hangs with >16 varying values.

Fixes glsl-routing in piglit and hangs in glbenchmark 2.0.2.

8 years ago vl/rbsp: fix another three byte not detected
Leo Liu [Mon, 22 Aug 2016 16:05:53 +0000 (12:05 -0400)]
vl/rbsp: fix another three byte not detected

This happens when the three-byte sequence "00 00 03" is only partly
loaded into vlc->buffer: the last valid bits in the buffer are "00" or
"00 00", and the remaining "00 03" or "03" is still in the data, so the
three-byte emulation check does not detect it. The reason is that the
escaped bit was set to 0 by the rbsp init.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
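
The split-sequence case described in the vl/rbsp message above can be
illustrated with a minimal sketch. This is not the vl/rbsp code: the names
epb_state and epb_strip are hypothetical, and the sketch simply assumes that
the count of consecutive zero bytes is carried across chunk boundaries, so a
"00 00 03" sequence is still recognized when its bytes arrive in separate
loads.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical state: number of consecutive 0x00 bytes seen so far.
 * It is kept between calls so a sequence split across chunks is found. */
struct epb_state {
   unsigned zeros;
};

/* Copy len bytes from in to out, dropping every 0x03 that directly
 * follows two zero bytes. Returns the number of bytes written. */
static size_t
epb_strip(struct epb_state *st, const uint8_t *in, size_t len, uint8_t *out)
{
   size_t n = 0;

   for (size_t i = 0; i < len; i++) {
      if (st->zeros >= 2 && in[i] == 0x03) {
         st->zeros = 0;   /* emulation prevention byte: skip it */
         continue;
      }
      st->zeros = (in[i] == 0x00) ? st->zeros + 1 : 0;
      out[n++] = in[i];
   }
   return n;
}
```

Feeding the chunks "11 00" and "00 03 01" through the same epb_state drops
the 0x03 in the second chunk, which a per-chunk check that resets its state
would miss.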
8 years ago radeonsi: fix VM faults due to NULL internal const buffers on CIK
Marek Olšák [Thu, 18 Aug 2016 13:25:51 +0000 (15:25 +0200)]
radeonsi: fix VM faults due to NULL internal const buffers on CIK

They are harmless, but the interrupts do decrease performance.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97039

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
8 years ago gallium/winsys/kms: Look up the GEM handle after importing a prime FD
Tomasz Figa [Tue, 2 Aug 2016 10:46:28 +0000 (19:46 +0900)]
gallium/winsys/kms: Look up the GEM handle after importing a prime FD

drmPrimeFDToHandle() will return the same GEM handle every time the same
buffer is imported, even from a different prime FD. Since GEM handles
are not reference counted, we need to make sure that each GEM handle is
referenced only by one display target struct, by looking it up in
kms_sw->bo_list first and bumping the refcount of the found dt on hit
and falling back to creating a new dt only on miss.

v2: Split into separate function.
    Use helper function for lookup.

v3 [Emil Velikov]:
    Rename kms_sw_displaytarget_{lookup,find_and_ref} (Jordan)

Signed-off-by: Tomasz Figa <tfiga@chromium.org>
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Hans de Goede <hdegoede@redhat.com> (v2)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
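
The lookup-then-reference pattern described in the message above can be
sketched as follows. This is a simplified illustration under stated
assumptions, not the winsys code: struct dt, dt_find_and_ref and dt_import
are hypothetical names, and a singly linked list stands in for
kms_sw->bo_list.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified display target (not kms_sw_displaytarget). */
struct dt {
   uint32_t handle;     /* GEM handle; the same buffer yields the same one */
   int ref_count;
   struct dt *next;
};

/* Return the display target that already owns handle, taking a
 * reference on it, or NULL if no such display target exists. */
static struct dt *
dt_find_and_ref(struct dt *list, uint32_t handle)
{
   for (struct dt *d = list; d; d = d->next) {
      if (d->handle == handle) {
         d->ref_count++;
         return d;
      }
   }
   return NULL;
}

/* Import path: reuse the existing display target on a hit and create a
 * new one, with a single reference, only on a miss. */
static struct dt *
dt_import(struct dt **list, uint32_t handle)
{
   struct dt *d = dt_find_and_ref(*list, handle);
   if (d)
      return d;

   d = calloc(1, sizeof(*d));
   if (!d)
      return NULL;

   d->handle = handle;
   d->ref_count = 1;
   d->next = *list;
   *list = d;
   return d;
}
```

Importing the same handle twice returns the same struct with a reference
count of 2, so the handle is closed only once, when the last reference is
dropped.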
8 years ago gallium/winsys/kms: Move display target handle lookup to separate function
Tomasz Figa [Tue, 2 Aug 2016 10:46:27 +0000 (19:46 +0900)]
gallium/winsys/kms: Move display target handle lookup to separate function

As a preparation to use the lookup in more than one place, move the
code that looks up given KMS/GEM handle to a separate function. This
change should not introduce any functional changes.

v2: Split into separate patch.
    Move lookup code into separate function.

v3 [Emil Velikov]:
    Rename kms_sw_displaytarget_{lookup,find_and_ref} (Jordan)

Signed-off-by: Tomasz Figa <tfiga@chromium.org>
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Hans de Goede <hdegoede@redhat.com> (v2)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
8 years ago gallium/winsys/kms: Fully initialize kms_sw_dt at prime import time (v2)
Tomasz Figa [Tue, 2 Aug 2016 10:46:26 +0000 (19:46 +0900)]
gallium/winsys/kms: Fully initialize kms_sw_dt at prime import time (v2)

Currently kms_sw_displaytarget_add_from_prime() allocates the struct and
fills in only some of the fields, resulting in a half-baked struct that
needs to be further completed by the caller. To make this a bit more
consistent, pass width, height and stride to this function and fill in
everything there, so that caller can take the returned struct as is.

v2: Split from one big patch into four fixing one thing at a time.

Signed-off-by: Tomasz Figa <tfiga@chromium.org>
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
8 years ago gallium/winsys/kms: Fix double refcount when importing from prime FD (v2)
Tomasz Figa [Tue, 2 Aug 2016 10:46:25 +0000 (19:46 +0900)]
gallium/winsys/kms: Fix double refcount when importing from prime FD (v2)

Currently the code creates a display target struct with refcount field
initialized to 1 and then the caller again increments it, leading to
a leaked reference. Let's remove the unnecessary increment.

v2: Split from one big patch into four fixing one thing at a time.

Signed-off-by: Tomasz Figa <tfiga@chromium.org>
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
8 years ago shaderapi: don't generate not linked error on GetProgramStage in general
Alejandro Piñeiro [Tue, 23 Aug 2016 15:00:54 +0000 (17:00 +0200)]
shaderapi: don't generate not linked error on GetProgramStage in general

Neither ARB_shader_subroutine nor the GL core spec lists any
error when the program is not linked.

We left the error generation for the uniform location in place, in order
to be consistent with other methods from the spec that generate it.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
8 years ago gallium/cso: avoid unnecessary null dereference
Eric Engestrom [Tue, 12 Jul 2016 21:48:28 +0000 (22:48 +0100)]
gallium/cso: avoid unnecessary null dereference

The label `out:` calls `destroy()` which dereferences `ctx`.
This is unnecessary as there is nothing to destroy.
Immediately return instead.

CovID: 1258255
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
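
The early-return pattern from the gallium/cso message above, as a minimal
sketch. The struct ctx type and the ctx_create/ctx_destroy functions are
hypothetical stand-ins, not the cso code; the point is only that the cleanup
label must not be reached before there is an object to clean up.

```c
#include <stdlib.h>

/* Hypothetical context type, for illustration only. */
struct ctx {
   int *cache;
};

static void
ctx_destroy(struct ctx *ctx)
{
   free(ctx->cache);   /* dereferences ctx, so ctx must not be NULL */
   free(ctx);
}

static struct ctx *
ctx_create(size_t cache_entries)
{
   struct ctx *ctx = calloc(1, sizeof(*ctx));
   if (!ctx)
      return NULL;     /* nothing to destroy yet: return immediately */

   ctx->cache = calloc(cache_entries, sizeof(*ctx->cache));
   if (!ctx->cache)
      goto out;        /* ctx exists, so the cleanup path is safe here */

   return ctx;

out:
   ctx_destroy(ctx);
   return NULL;
}
```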
8 years ago .gitignore: Ignore tags generated by `make tags`
Eric Engestrom [Tue, 31 May 2016 01:30:16 +0000 (02:30 +0100)]
.gitignore: Ignore tags generated by `make tags`

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
[Emil Velikov: rebase]
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
8 years ago st/xvmc: fix a couple 'unused-but-set-variable' warnings
Eric Engestrom [Tue, 12 Jul 2016 22:41:50 +0000 (23:41 +0100)]
st/xvmc: fix a couple 'unused-but-set-variable' warnings

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
8 years ago egl: turn a couple asserts static (compile-time)
Eric Engestrom [Mon, 22 Aug 2016 20:52:03 +0000 (21:52 +0100)]
egl: turn a couple asserts static (compile-time)

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
8 years ago i915: remove unnecessary `if`
Eric Engestrom [Mon, 15 Aug 2016 14:51:21 +0000 (15:51 +0100)]
i915: remove unnecessary `if`

if (x) return true; else return false;
can be simplified as:
return x;
since `x` is already a boolean expression.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
8 years ago i965: remove unnecessary `if`
Eric Engestrom [Mon, 15 Aug 2016 14:51:20 +0000 (15:51 +0100)]
i965: remove unnecessary `if`

if (x) return true; else return false;
can be simplified as:
return x;
since both `x` are already boolean expressions.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
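
The simplification named in the two "remove unnecessary `if`" messages
above, written out as compilable C. The function names and the alignment
check are made up for illustration and are not taken from the i915 or i965
drivers.

```c
#include <stdbool.h>

/* Before: the branch only restates the value of the condition. */
static bool
is_aligned_verbose(unsigned value, unsigned alignment)
{
   if ((value % alignment) == 0)
      return true;
   else
      return false;
}

/* After: the condition is already a boolean expression, so return it. */
static bool
is_aligned(unsigned value, unsigned alignment)
{
   return (value % alignment) == 0;
}
```

Both functions return the same result for every input with a non-zero
alignment; the second form just has less code to read.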
8 years ago program_resource: subroutine active uniforms should return NumSubroutineUniforms
Alejandro Piñeiro [Thu, 18 Aug 2016 17:44:55 +0000 (19:44 +0200)]
program_resource: subroutine active uniforms should return NumSubroutineUniforms

Before this commit, GetProgramInterfaceiv for pname ACTIVE_RESOURCES
and all the <shader>_SUBROUTINE_UNIFORM programInterface were
returning the count of resources on the shader program using that
interface, instead of the number of uniform resources. This would give a
wrong value (for example) if the shader has an array of subroutine
uniforms.

Note that this means that in order to get a proper value, the shader
needs to be linked, something that is not explicitly mentioned in the
ARB_program_interface_query spec, but comes from the general
definition of active uniform. If the program is not linked we
return 0.

v2: don't generate an error if the program is not linked, returning 0
    active uniforms instead, plus extra spec references (Tapani Palli)

Fixes GL44-CTS.program_interface_query.subroutines-compute

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
8 years ago egl/wayland-egl: Fix for segfault in dri2_wl_destroy_surface.
Stencel, Joanna [Mon, 22 Aug 2016 07:48:50 +0000 (09:48 +0200)]
egl/wayland-egl: Fix for segfault in dri2_wl_destroy_surface.

Segfault occurs when destroying EGL surface attached to already destroyed
Wayland window. The fix is to set to NULL the pointer of surface's
native window when wl_egl_destroy_window() is called.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Stencel, Joanna <joanna.stencel@intel.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
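
The fix described in the egl/wayland-egl message above follows a common
pattern: when the native window goes away first, clear the surface's pointer
to it so that later surface teardown does not touch freed memory. A minimal
sketch with hypothetical types and function names (none of these are the
wayland-egl or dri2 structures):

```c
#include <stddef.h>

struct surface;

/* Hypothetical native window that remembers which surface uses it. */
struct window {
   struct surface *surface;
   int resize_hook_installed;
};

struct surface {
   struct window *window;   /* NULL once the window has been destroyed */
};

static void
surface_attach(struct surface *s, struct window *w)
{
   s->window = w;
   w->surface = s;
   w->resize_hook_installed = 1;
}

/* Called when the application destroys the native window first. */
static void
window_destroy(struct window *w)
{
   if (w->surface)
      w->surface->window = NULL;   /* the surface must not touch w again */
   w->surface = NULL;
}

/* Surface teardown only touches the window if it still exists. */
static void
surface_destroy(struct surface *s)
{
   if (s->window) {
      s->window->resize_hook_installed = 0;
      s->window->surface = NULL;
      s->window = NULL;
   }
}
```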
8 years ago st/va: Remove unused variable coded_size from vlVaEndPicture()
Kai Wasserbäch [Sat, 20 Aug 2016 16:14:54 +0000 (18:14 +0200)]
st/va: Remove unused variable coded_size from vlVaEndPicture()

Removes the following GCC warning:
 ../../../../../src/gallium/state_trackers/va/picture.c:542:17: warning:
  unused variable 'coded_size' [-Wunused-variable]
    unsigned int coded_size;
                 ^~~~~~~~~~

Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
8 years ago st/va: Remove else case in vlVaEndPicture() made superfluous by c59628d11b
Kai Wasserbäch [Sat, 20 Aug 2016 16:14:53 +0000 (18:14 +0200)]
st/va: Remove else case in vlVaEndPicture() made superfluous by c59628d11b

Commit c59628d11b134fc016388a170880f7646e100d6f made the else statement
and duplication of the context->decoder->end_frame() call superfluous.

Cc: Boyuan Zhang <boyuan.zhang@amd.com>
Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
8 years ago st/va: add missing mutex_unlock
Eric Engestrom [Sun, 21 Aug 2016 21:11:48 +0000 (22:11 +0100)]
st/va: add missing mutex_unlock

Fixes: c59628d11b134fc01638 ("st/va: enable dual instances encode by sync surface")
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
8 years ago aubinator: Style fixes.
Kenneth Graunke [Tue, 23 Aug 2016 23:35:59 +0000 (16:35 -0700)]
aubinator: Style fixes.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
8 years ago aubinator: Fix the tool to correctly decode the DWords
Sirisha Gandikota [Fri, 19 Aug 2016 19:13:25 +0000 (12:13 -0700)]
aubinator: Fix the tool to correctly decode the DWords

Several fixes have been added as part of this as listed below:

1) Fix the mask and add disassembler handling for STATE_DS, STATE_HS
as the mask returned wrong values of the fields.

2) Fix the GEN_TYPE_ADDRESS/GEN_TYPE_OFFSET decoding - the address/
offset were handled the same way as the other fields and that gives
the wrong values for the address/offset.

3) Decode nested/recursive structures - Many packets contain nested
structures, ex: 3DSTATE_SO_BUFFER, STATE_BASE_ADDRESS, etc contain MOC
structures. Previously, the aubinator printed 1 if there was a MOC
structure. Now we decode the entire structure and print out its fields.

4) Print out the DWord address along with its hex value - For a better
clarity of information, it is helpful to print both the address and
hex value of the DWord along with the DWord count. Since the DWord0
contains the instruction code and the instruction length, it is
unnecessary to print the decoded values for DWord0. This information
is already available from the DWord hex value.

5) Decode the <group> and the corresponding fields in the group - The
<group> tag can have fields of several types including structures. A
group can contain one or more fields and this has to be correctly
decoded. Previously, aubinator did not decode the groups or the
fields/structures inside them. Now we decode the <group> in the
instructions and structures where the fields in it repeat the
specified number of times.

v2: Fix the formatting (per Matt)
Make the start and end pos calculation to extract fields from a DWord
more appropriate by moving %32 away from mask() method

Signed-off-by: Sirisha Gandikota <Sirisha.Gandikota@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Ben Widawsky <ben@bwidawsk.net>
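
The v2 note in the message above mentions moving the "% 32" out of the
mask() method. A minimal sketch of that split, assuming a field never
crosses a DWord boundary; field_mask and field_value are hypothetical names,
not the aubinator functions.

```c
#include <stdint.h>

/* Mask with bits [start, end] set; both are in 0..31 and start <= end. */
static uint32_t
field_mask(unsigned start, unsigned end)
{
   uint32_t width = end - start + 1;
   uint32_t low = (width == 32) ? UINT32_MAX : ((UINT32_C(1) << width) - 1);

   return low << start;
}

/* Extract a field given bit positions counted from the start of the
 * packet. The reduction modulo 32 happens here, in the caller, so that
 * field_mask() only ever sees positions inside one DWord. */
static uint32_t
field_value(const uint32_t *dwords, unsigned start_bit, unsigned end_bit)
{
   uint32_t dw = dwords[start_bit / 32];
   unsigned start = start_bit % 32;
   unsigned end = end_bit % 32;

   return (dw & field_mask(start, end)) >> start;
}
```

For a packet whose second DWord is 0x00ABCD00, bits 40 to 55 select
DWord 1, bits 8 to 23, which gives the field value 0xABCD.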
8 years ago aubinator: Add a new tool called Aubinator to the src/intel/tools folder.
Kristian Høgsberg Kristensen [Fri, 19 Aug 2016 19:13:24 +0000 (12:13 -0700)]
aubinator: Add a new tool called Aubinator to the src/intel/tools folder.

The Aubinator tool is designed to help the driver developers in debugging
the driver functionality by decoding the data in the .aub files.
Primary Authors of this tool are Damien Lespiau <damien.lespiau at intel.com>
and Kristian Høgsberg Kristensen <krh at bitplanet.net>.

v2: Review comments are incorporated by Sirisha Gandikota as below:
1) Make Makefile.am more crisp, reuse intel_aub.h from libdrm (per Emil)
2) Aubinator will use platform name instead of GEN number (per Matt)
3) Disassembler gets created based on pciid rather than GEN number (per Matt)
4) Other formatting comments (per Ken, Matt and Emil)

Signed-off-by: Sirisha Gandikota <Sirisha.Gandikota@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Ben Widawsky <ben@bwidawsk.net>
8 years ago glsl: Mark tessellation qualifier maps static const.
Kenneth Graunke [Mon, 22 Aug 2016 23:39:14 +0000 (16:39 -0700)]
glsl: Mark tessellation qualifier maps static const.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years ago isl/formats: Integer formats are not filterable
Jason Ekstrand [Tue, 23 Aug 2016 21:11:18 +0000 (14:11 -0700)]
isl/formats: Integer formats are not filterable

In ca2a8e56285, we updated the format table to add more formats (most of
which are new on SKL) but accidentally marked some integer formats as
filterable.  You can't filter an integer format.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
8 years ago st/dri: respect driver's request to avoid mixed color/depth bit configs
Ilia Mirkin [Sun, 21 Aug 2016 02:42:45 +0000 (22:42 -0400)]
st/dri: respect driver's request to avoid mixed color/depth bit configs

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
8 years ago gallium: add a cap to expose whether driver supports mixed color/zs bits
Ilia Mirkin [Sun, 21 Aug 2016 02:40:33 +0000 (22:40 -0400)]
gallium: add a cap to expose whether driver supports mixed color/zs bits

Some hardware can't render to color/depth buffers of mixed bitness. When
that happens a fallback has to happen, but this allows the driver to
express that this isn't an optimal scenario. The purpose of this is to
remove such fbconfigs from the GLX/EGL config list.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
8 years ago dri: add a way to request that modes have matching color/zs depths
Ilia Mirkin [Sat, 20 Aug 2016 20:10:20 +0000 (16:10 -0400)]
dri: add a way to request that modes have matching color/zs depths

Some GPUs, notably nv3x/nv4x can't render to mismatched color/zs
framebuffer depths. Fallbacks can be done by the driver, with shadow
surfaces, but no reason to encourage applications to select non-matching
glx visuals.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
8 years ago nv50/ir: make sure cfg iterator always hits all blocks
Ilia Mirkin [Fri, 19 Aug 2016 04:41:59 +0000 (00:41 -0400)]
nv50/ir: make sure cfg iterator always hits all blocks

In some very specially-crafted cases, we could attempt to visit a node
that has already been visited, and then run out of bb's to visit, while
there were still cross blocks on the list. Make sure that those get
moved over in that case.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96274
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
8 years ago anv/clear: Clear E5B9G9R9 images as R32_UINT
Jason Ekstrand [Wed, 3 Aug 2016 18:41:45 +0000 (11:41 -0700)]
anv/clear: Clear E5B9G9R9 images as R32_UINT

We can't actually clear these images normally because we can't render to
them.  Instead, we have to manually unpack the rgb9e5 color value on the
CPU and clear it as R32_UINT.  We still have a bit of work to do to clear
non-power-of-two images, but this should get all of the power-of-two clears
working on at least Haswell.  This fixes three of the new Vulkan CTS tests
in the dEQP-VK.api.image_clearing.clear_color_image.* group.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
8 years ago anv/clear: Make cmd_clear_image take an actual VkClearValue
Jason Ekstrand [Wed, 3 Aug 2016 18:37:24 +0000 (11:37 -0700)]
anv/clear: Make cmd_clear_image take an actual VkClearValue

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
8 years ago anv/blit2d: Add support for RGB destinations
Jason Ekstrand [Tue, 2 Aug 2016 15:47:51 +0000 (08:47 -0700)]
anv/blit2d: Add support for RGB destinations

This fixes 104 of the new image_clearing and copy_and_blit Vulkan CTS
tests.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
8 years ago anv/blit2d: Add a format parameter to bind_dst and create_iview
Jason Ekstrand [Tue, 2 Aug 2016 15:28:39 +0000 (08:28 -0700)]
anv/blit2d: Add a format parameter to bind_dst and create_iview

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
8 years ago anv/image: Don't create invalid render target surfaces
Jason Ekstrand [Tue, 26 Jul 2016 22:29:40 +0000 (15:29 -0700)]
anv/image: Don't create invalid render target surfaces

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
8 years ago isl/formats: Update the table with more samplable formats
Jason Ekstrand [Wed, 27 Jul 2016 00:31:11 +0000 (17:31 -0700)]
isl/formats: Update the table with more samplable formats

There were a lot of formats where support was added on Haswell or later but
we never updated the format table.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
8 years ago isl/formats: Report ETC as being samplable on Bay Trail
Jason Ekstrand [Wed, 27 Jul 2016 00:32:01 +0000 (17:32 -0700)]
isl/formats: Report ETC as being samplable on Bay Trail

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
8 years ago i965/surface_formats: Don't advertise 8 or 16-bit RGB formats
Jason Ekstrand [Wed, 27 Jul 2016 03:59:38 +0000 (20:59 -0700)]
i965/surface_formats: Don't advertise 8 or 16-bit RGB formats

We have implicitly been not advertising these formats since we had them
turned off in the format capabilities table.  We are about to update that
table and this prevents a change in behavior.  The only change in behavior
created by this patch is that we no longer advertise support for
R16G16B16_FLOAT which means that it's now renderable which seems like a
bonus.  Maybe someday we'll want to change things to start supporting
16-bit RGB formats natively but, at the moment, there's no need.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
8 years ago anv/formats: Don't use an RGBX format if it isn't renderable
Jason Ekstrand [Tue, 26 Jul 2016 18:33:45 +0000 (11:33 -0700)]
anv/formats: Don't use an RGBX format if it isn't renderable

The whole point of using RGBX is so that we can render to it so if it isn't
renderable, that kind-of defeats the purpose.  Some formats (one example is
R32G32B32X32_SFLOAT) exist in the format table but aren't actually
renderable.  Eventually, we'd like to get away from RGBX entirely, but this
fixes hangs on BDW today.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
8 years ago egl/dri2: dri2_initialize: Do not reference-count TestOnly display
Nicolas Boichat [Wed, 3 Aug 2016 13:54:22 +0000 (21:54 +0800)]
egl/dri2: dri2_initialize: Do not reference-count TestOnly display

In the case where dri2_initialize is called with a TestOnly display,
the display is not actually initialized, so dri2_egl_display always
fails, and we cannot do any reference counting.

Fixes piglit spec@egl_khr_create_context@verify gl flavor (reproducible
with LIBGL_ALWAYS_SOFTWARE=1).

Fixes: 9ee683f877 (egl/dri2: Add reference count for dri2_egl_display)
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reported-by: Michel Dänzer <michel@daenzer.net>
Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
8 years ago vbo: fix format string compiler warning for 32-bit machines
Jan Ziak [Tue, 2 Aug 2016 14:40:00 +0000 (08:40 -0600)]
vbo: fix format string compiler warning for 32-bit machines

Signed-off-by: Jan Ziak (http://atom-symbol.net) <0xe2.0x9a.0x9b@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
8 years ago egl/dri2: remove error checks on return values from mtx_lock and cnd_wait
Dongwon Kim [Mon, 15 Aug 2016 22:12:03 +0000 (15:12 -0700)]
egl/dri2: remove error checks on return values from mtx_lock and cnd_wait

This removes unnecessary error checks on the return values of mtx_lock
and cnd_wait calls, as in all other places in the Mesa source, since
neither function can return an error code in the current
implementation.

This patch also removes a redundant _eglError call that follows
the EGL_FALSE check at the bottom of dri2_client_wait_sync.

Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
8 years ago i965: report bound buffer size not underlying buffer size for image size (v2)
Dave Airlie [Fri, 27 May 2016 05:00:57 +0000 (15:00 +1000)]
i965: report bound buffer size not underlying buffer size for image size (v2)

This seems to make sense, the image is bound to a subset of the buffer
so the image size should be from the bound size not the underlying
object.

This fixes:
GL44-CTS.shader_image_size.advanced-nonMS-fs-int

v2: get minimum of the two values, same as we write to the hw.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
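
The v2 rule in the message above (take the minimum of the bound size and the
underlying buffer size) can be sketched as a small helper. The function name,
its parameters and the division by a texel size are assumptions made for
illustration; this is not the i965 code.

```c
#include <stdint.h>

/* Hypothetical helper: number of texels visible through a buffer image.
 * bound_size and underlying_size are in bytes; texel_size must be > 0. */
static uint32_t
image_size_texels(uint32_t bound_size, uint32_t underlying_size,
                  uint32_t texel_size)
{
   /* The image cannot expose more than the bound range, and the bound
    * range cannot extend past the end of the underlying buffer. */
   uint32_t bytes = bound_size < underlying_size ? bound_size
                                                 : underlying_size;

   return bytes / texel_size;
}
```

With a 1024-byte buffer and a 256-byte binding of 4-byte texels, the
reported size is 64 texels rather than 256.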
8 years ago anv: Throw INCOMPATIBLE_DRIVER for non-fatal initialization errors
Jason Ekstrand [Tue, 23 Aug 2016 01:10:14 +0000 (18:10 -0700)]
anv: Throw INCOMPATIBLE_DRIVER for non-fatal initialization errors

The only reason we should throw INITIALIZATION_FAILED is if we have found
usable Intel hardware but have failed to bring it up for some reason.
Otherwise, we should just throw INCOMPATIBLE_DRIVER which will turn into
successfully advertising 0 physical devices.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
8 years ago st/glsl_to_tgsi: fix st_src_reg_for_double constant.
Dave Airlie [Tue, 5 Jul 2016 00:26:14 +0000 (10:26 +1000)]
st/glsl_to_tgsi: fix st_src_reg_for_double constant.

This needs to set the src swizzle so it doesn't access the .zw
members ever when we are just emitting a 0 constant here.

This fixes:
vert-conversion-explicit-dvec3-bvec3.shader_test
and a bunch of other fp64 tests on softpipe and radeonsi.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
8 years ago mesa/subroutines: drop the old subroutine index uploads.
Dave Airlie [Tue, 7 Jun 2016 05:25:59 +0000 (15:25 +1000)]
mesa/subroutines: drop the old subroutine index uploads.

We used to upload the indices when they changed, now we rely
on the drivers calling the correct hook to have the values
updated from the context storage.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
8 years ago st/mesa: use the new subroutine index upload API.
Dave Airlie [Tue, 7 Jun 2016 05:25:58 +0000 (15:25 +1000)]
st/mesa: use the new subroutine index upload API.

This plugs the new API into the gallium state tracker.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Acked-by: Andres Gomez <agomez@igalia.com>
8 years ago i965: use new subroutine index uploader.
Dave Airlie [Tue, 7 Jun 2016 05:25:57 +0000 (15:25 +1000)]
i965: use new subroutine index uploader.

This plugs the subroutine index updates into the i965 backend,
where it loads constants.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Acked-by: Andres Gomez <agomez@igalia.com>
8 years ago mesa: add api to write subroutine indices to the program storage.
Dave Airlie [Tue, 7 Jun 2016 05:25:56 +0000 (15:25 +1000)]
mesa: add api to write subroutine indices to the program storage.

This writes the subroutine indices to the program storage for
a stage. This API is intended to be used by drivers to update
the uniform storage before uploading to the hw.

This isn't the most thread safe effort, but it will be significantly
more multi-context safe.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
8 years ago mesa/subroutines: start adding per-context subroutine index support (v1.1)
Dave Airlie [Tue, 7 Jun 2016 05:25:55 +0000 (15:25 +1000)]
mesa/subroutines: start adding per-context subroutine index support (v1.1)

One piece of ARB_shader_subroutine I ignored was the fact that it
needs to store the subroutine index data per context and not per
shader program.

There is one CTS test that tests this:
GL45-CTS.shader_subroutine.multiple_contexts

However, the test only writes to the context and reads the values back;
it never renders using them, so this is enough to fix the
test but not enough to do what the spec says.

So with this patch the info is now stored per context, but
it gets updated into the program at UseProgram and when the
values are inserted into the context, which won't help if
multiple contexts are in use in multiple threads.

v1.1: cleanups and nit-picks (Andres)

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
8 years ago vbo: Make #if 0'd debugging code compile.
Matt Turner [Mon, 22 Aug 2016 22:40:48 +0000 (15:40 -0700)]
vbo: Make #if 0'd debugging code compile.

8 years ago nir: avoid segfault when ssa src not found
Timothy Arceri [Sun, 21 Aug 2016 08:31:40 +0000 (18:31 +1000)]
nir: avoid segfault when ssa src not found

Without this the following line will segfault and we don't get to
see the results of the validate_assert() above.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
8 years ago vc4: Tell state_tracker that we would prefer NIR.
Eric Anholt [Tue, 17 May 2016 20:04:20 +0000 (13:04 -0700)]
vc4: Tell state_tracker that we would prefer NIR.

Before this series, the code generation path was:

GLSL IR -> TGSI -> NIR -> NIR clone -> QIR -> QPU

Now it's (generally):

GLSL IR -> NIR -> NIR clone -> QIR -> QPU

8 years ago st/nir: Trim out unused VS input variables.
Eric Anholt [Sat, 20 Aug 2016 00:12:12 +0000 (17:12 -0700)]
st/nir: Trim out unused VS input variables.

If we're going to skip setting up vertex input data in them, we should
probably not leave them as vertex inputs with a driver_location that
happens to alias to something else.

Fixes a regression in glsl-mat-attribute on vc4 when enabling GTN.

v2: Change commit message shortlog, lower the new globals away before
    handing off to the driver.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
8 years ago nir: Fix crash in nir_lower_drawpixels.
Eric Anholt [Fri, 19 Aug 2016 22:38:08 +0000 (15:38 -0700)]
nir: Fix crash in nir_lower_drawpixels.

Generally you'd see the gl_Color reference first and get some cursor set.
However, in piglit draw-pixel-with-texture we're now seeing the TexCoord
dereferenced first.

Reviewed-by: Rob Clark <robdclark@gmail.com>
8 years ago nir: Fix a comment typo in nir_lower_drawpixels.
Eric Anholt [Fri, 19 Aug 2016 22:37:52 +0000 (15:37 -0700)]
nir: Fix a comment typo in nir_lower_drawpixels.

Reviewed-by: Rob Clark <robdclark@gmail.com>
8 years ago vc4: Use proper type sizes for uniforms.
Eric Anholt [Wed, 27 Jul 2016 22:03:39 +0000 (15:03 -0700)]
vc4: Use proper type sizes for uniforms.

8 years ago vc4: Add VARYING_SLOT_PNTC support.
Eric Anholt [Fri, 5 Aug 2016 06:06:36 +0000 (23:06 -0700)]
vc4: Add VARYING_SLOT_PNTC support.

We end up with this when doing GLSL-to-NIR.

8 years ago vc4: Fix vc4_nir_lower_io for non-vec4 I/O.
Eric Anholt [Tue, 17 May 2016 20:46:51 +0000 (13:46 -0700)]
vc4: Fix vc4_nir_lower_io for non-vec4 I/O.

To support GLSL-to-NIR, we need to be able to support actual
float/vec2/vec3 varyings.

8 years ago nir: Define system values for vc4's blending-lowering arguments.
Eric Anholt [Wed, 27 Jul 2016 00:31:44 +0000 (17:31 -0700)]
nir: Define system values for vc4's blending-lowering arguments.

In the GLSL-to-NIR conversion of VC4, I had a bit of trouble with what I
was calling the "state uniforms" that I was putting into the NIR fighting
with its other lowering passes.  Instead of using magic uniform base
numbers in the backend, follow the lead of load_user_clip_plane and just
define system values for them.

v2: Fix unintended change to channel_num, drop unspecified const_index
    value on blend_const_color_r_float.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
8 years ago anv: GetDeviceImageFormatProperties: fix TRANSFER formats
Lionel Landwerlin [Sat, 13 Aug 2016 00:00:57 +0000 (01:00 +0100)]
anv: GetDeviceImageFormatProperties: fix TRANSFER formats

We let the user believe we support some transfer formats that we don't.
This can lead to crashes when actually trying to use those formats, for
example in the dEQP-VK.api.copy_and_blit.image_to_image.* tests.

Allow all formats we can render to or sample from, since meta implements
transfers using attachments.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
8 years ago gallium/hud: round max_value to print nicely rounded numbers next to graphs
Marek Olšák [Thu, 18 Aug 2016 17:42:16 +0000 (19:42 +0200)]
gallium/hud: round max_value to print nicely rounded numbers next to graphs

This improves readability a lot.

Reviewed-by: Brian Paul <brianp@vmware.com>
8 years ago gallium/hud: generalize code for drawing numbers next to graphs
Marek Olšák [Thu, 18 Aug 2016 15:54:13 +0000 (17:54 +0200)]
gallium/hud: generalize code for drawing numbers next to graphs

Reviewed-by: Brian Paul <brianp@vmware.com>
8 years ago gallium/hud: draw numbers with 3 decimal places if those aren't 0
Marek Olšák [Thu, 18 Aug 2016 15:21:54 +0000 (17:21 +0200)]
gallium/hud: draw numbers with 3 decimal places if those aren't 0

Reviewed-by: Brian Paul <brianp@vmware.com>
8 years ago gallium/hud: use sRGB for nicer AA lines
Marek Olšák [Wed, 17 Aug 2016 16:27:48 +0000 (18:27 +0200)]
gallium/hud: use sRGB for nicer AA lines

Reviewed-by: Brian Paul <brianp@vmware.com>
8 years ago gallium/hud: use AA lines for graphs
Marek Olšák [Wed, 17 Aug 2016 16:13:46 +0000 (18:13 +0200)]
gallium/hud: use AA lines for graphs

This looks a lot better (with the next patch).

Reviewed-by: Brian Paul <brianp@vmware.com>
8 years ago gallium/hud: don't enable blending for all objects
Marek Olšák [Wed, 17 Aug 2016 16:13:01 +0000 (18:13 +0200)]
gallium/hud: don't enable blending for all objects

Reviewed-by: Brian Paul <brianp@vmware.com>
8 years ago util: add assert that key cannot be NULL on insertion
Tapani Pälli [Fri, 19 Aug 2016 11:33:13 +0000 (14:33 +0300)]
util: add assert that key cannot be NULL on insertion

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
8 years ago glsl: fix key used for hashing switch statement cases
Tapani Pälli [Fri, 19 Aug 2016 10:44:54 +0000 (13:44 +0300)]
glsl: fix key used for hashing switch statement cases

The implementation previously used the value itself as the key; however, after
the hash implementation change in ee02a5e we cannot use 0 as a key.

v2: use constant pointer as the key and implement comparison
    for contents (Eric Anholt)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97309

8 years ago android: i965: add per-gen libmesa_i965_gen{8,9} static
Mauro Rossi [Fri, 19 Aug 2016 22:04:29 +0000 (00:04 +0200)]
android: i965: add per-gen libmesa_i965_gen{8,9} static

Needed to fix android build after commit 16a9fcb
which enabled genxml for gen{8,9} state setup

This is the last patch needed, android build tested successfully.

8 years ago android: i965: add per-gen libmesa_i965_gen{7,75} static libraries
Mauro Rossi [Fri, 19 Aug 2016 21:55:54 +0000 (23:55 +0200)]
android: i965: add per-gen libmesa_i965_gen{7,75} static libraries

Needed to fix android build after commit e198983
which enabled genxml for gen{7,75} state setup

Android build fix for gen{8,9} will follow as incremental patch,
build tested successfully with all per-gen patches applied.

8 years ago android: i965: add per-gen libmesa_i965_gen6 static library
Mauro Rossi [Fri, 19 Aug 2016 21:36:11 +0000 (23:36 +0200)]
android: i965: add per-gen libmesa_i965_gen6 static library

Needed to fix android build after commit c8bc1ae
where new per-gen genX_blorp.c source replaced gen6_blorp.c for gen6

Android build fixes for gen{7,75} and gen{8,9} will follow as incremental patches,
build tested successfully with all per-gen patches applied.

8 years ago glsl: Rename link_fs_input_layout_qualifiers to "inout".
Kenneth Graunke [Tue, 28 Jun 2016 17:00:18 +0000 (10:00 -0700)]
glsl: Rename link_fs_input_layout_qualifiers to "inout".

We're going to handle output qualifiers here too, and calling it "inout"
seems to be the going convention.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
8 years ago i965/cfg: Factor common code out of switch statement.
Matt Turner [Wed, 17 Aug 2016 18:40:01 +0000 (11:40 -0700)]
i965/cfg: Factor common code out of switch statement.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
8 years ago anv: Give the installed intel_icd.json file an absolute path
Jason Ekstrand [Fri, 19 Aug 2016 16:01:14 +0000 (09:01 -0700)]
anv: Give the installed intel_icd.json file an absolute path

Not providing a path allows the ICD to work on multi-arch systems but
breaks it if you install anywhere other than /usr/lib.  Given that users
may be installing locally in .local or similar, we probably do want to
provide a filename.  Distros can carry a revert of this commit if they want
an intel_icd.json file without the path.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Chad Versace <chad@kiwitree.net>
8 years ago mesa: Fix fixed function spot lighting on newer hardware (again)
Daniel Scharrer [Sat, 20 Aug 2016 02:23:29 +0000 (04:23 +0200)]
mesa: Fix fixed function spot lighting on newer hardware (again)

This was first fixed in commit b3f9c5c and then broken again in commit
fe2d2c7, which removed the abs modifier from input registers.

v2: Don't change the size of struct ureg.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91342
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Daniel Scharrer <daniel@constexpr.org>
8 years ago i965: Remove comment within a comment.
Matt Turner [Fri, 19 Aug 2016 23:54:42 +0000 (16:54 -0700)]
i965: Remove comment within a comment.

8 years ago llvmpipe: fix issues with depth clamp
Roland Scheidegger [Sat, 20 Aug 2016 02:03:11 +0000 (04:03 +0200)]
llvmpipe: fix issues with depth clamp

We only did depth clamp when the value was written from the fs.
This is very wrong both for d3d10 and GL, and only passed the
corresponding piglit test due to pure luck (it no longer does
with the enhanced test).
Also, interpolation always clamped values to 1.0, even though larger values
can legitimately occur if depth clip is disabled, so fix that as well (untested).
There is one unresolved issue left: d3d10 always does depth clamping,
whereas GL does not (but does a [0,1] clamp instead for fs depth outputs).
This information isn't in any gallium state object, so leave it as-is
for now (though it looks like llvmpipe misses the [0,1] clamp as well).
This (with the previous patch) fixes the piglit depth-clamp-range test.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
8 years ago llvmpipe: fix depth clamping wrt reversed near/far values
Roland Scheidegger [Mon, 15 Aug 2016 03:22:30 +0000 (05:22 +0200)]
llvmpipe: fix depth clamping wrt reversed near/far values

This wasn't handled before: if clamping actually happened, the result was
that no matter what value got clamped, it always ended up as the near value
in this case.
Fix this by using the util helper for that. The math is otherwise "mostly"
the same; "mostly" because there could actually be differences due to float
rounding, but I don't even know which one would be more correct.
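
As an illustration only, a clamp that tolerates reversed near/far values can
be sketched as below. clamp_depth_sketch is a hypothetical name and is not
the util helper the patch uses; the point is just that the bounds are ordered
before clamping:

```c
#include <assert.h>

/* Hypothetical helper: clamp z to the depth range even when near > far. */
static float
clamp_depth_sketch(float z, float near_val, float far_val)
{
   /* Order the bounds first so a reversed range still clamps correctly. */
   float zmin = near_val < far_val ? near_val : far_val;
   float zmax = near_val < far_val ? far_val : near_val;

   assert(zmin <= zmax); /* also rejects NaN range values */

   if (z < zmin)
      return zmin;
   if (z > zmax)
      return zmax;
   return z;
}
```

With near = 1.0 and far = 0.0, a value of 2.0 clamps to 1.0 and a value of
-1.0 clamps to 0.0, rather than both collapsing to the same bound.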

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
8 years ago i965/sched: Simplify work done by add_barrier_deps().
Matt Turner [Thu, 18 Aug 2016 23:47:05 +0000 (16:47 -0700)]
i965/sched: Simplify work done by add_barrier_deps().

Scheduling barriers are implemented by placing a dependence on every
node before and after the barrier. This is unnecessary as we can limit
the number of nodes we place dependencies on to those between us and the
next barrier in each direction.

Runtime of dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.23
is reduced from ~25 minutes to a little more than three.
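
The idea can be sketched on a flat array of nodes. This is a simplified model
with hypothetical names, not the scheduler's actual data structures: each walk
stops as soon as it has linked the neighbouring barrier, because that barrier
already orders everything beyond it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns how many dependency edges the barrier at index b would add. */
static size_t
count_barrier_deps_sketch(const bool *is_barrier, size_t n, size_t b)
{
   size_t edges = 0;

   assert(b < n && is_barrier[b]);

   /* Walk backwards and stop after linking the previous barrier. */
   for (size_t i = b; i-- > 0; ) {
      edges++;
      if (is_barrier[i])
         break;
   }

   /* Walk forwards and stop after linking the next barrier. */
   for (size_t i = b + 1; i < n; i++) {
      edges++;
      if (is_barrier[i])
         break;
   }

   return edges;
}

/* Example: eight nodes with barriers at indices 1, 4 and 6. */
static size_t
demo_barrier_edges_sketch(void)
{
   const bool is_barrier[] = {false, true, false, false,
                              true, false, true, false};
   return count_barrier_deps_sketch(is_barrier, 8, 4);
}
```

For the barrier at index 4 this links nodes 3, 2, 1 and 5, 6, i.e. five edges
instead of the seven needed to link every other node.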

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94681
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
8 years ago i965/vec4: Ignore swizzle of VGRF for use by var_range_end().
Matt Turner [Thu, 18 Aug 2016 22:54:47 +0000 (15:54 -0700)]
i965/vec4: Ignore swizzle of VGRF for use by var_range_end().

var_range_end(v, n) loops over the n components of variable number v and
finds the maximum value, giving the last use of any component of v.
Therefore it expects v to correspond to the variable associated with the
.x channel of the VGRF.

var_from_reg() however returns the variable for the first channel of the
VGRF, post-swizzle.

So, if the last register had a swizzle with y, z, or w in the swizzle
component, we would read out of bounds. For any other register, we would
read liveness information from the next register.

The fix is to convert the src_reg to a dst_reg in order to call the
dst_reg version of var_from_reg() that doesn't consider the swizzle.
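
The off-by-channel problem can be illustrated with a simplified model. All
names, the four-channel layout, and the liveness values below are assumptions
made for this sketch and are not copied from the i965 code:

```c
#include <assert.h>
#include <limits.h>

#define CHANNELS_PER_VGRF 4 /* assumption for this sketch */

/* Hypothetical "last use" instruction index per variable (channel). */
static const int example_end[2 * CHANNELS_PER_VGRF] = {
   3, 5, 2, 1, /* register 0: x, y, z, w */
   9, 9, 9, 9, /* register 1: x, y, z, w */
};

/* Variable index of a given channel of a register. */
static unsigned
var_index_sketch(unsigned reg, unsigned channel)
{
   assert(channel < CHANNELS_PER_VGRF);
   return reg * CHANNELS_PER_VGRF + channel;
}

/* Maximum over the n variables starting at v; expects v to be the .x
 * variable of the register. */
static int
var_range_end_sketch(const int *end, unsigned v, unsigned n)
{
   int last = INT_MIN;

   for (unsigned i = 0; i < n; i++) {
      if (end[v + i] > last)
         last = end[v + i];
   }
   return last;
}
```

Starting at the .x variable of register 0 gives 5. Starting at the .y
variable, as a post-swizzle index would, scans y, z, w and then the .x of
register 1, and reports 9 from the wrong register. For the last register the
same scan would run past the end of the array.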

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
8 years ago i965/vec4: Print spills:fills.
Matt Turner [Fri, 12 Aug 2016 18:44:26 +0000 (11:44 -0700)]
i965/vec4: Print spills:fills.

Allows shader-db to work on vec4 programs (has been broken since
shader-db commit 646df5ca98b2 from April!)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
8 years ago a4xx: make sure to actually clamp depth as requested
Ilia Mirkin [Mon, 15 Aug 2016 03:58:18 +0000 (23:58 -0400)]
a4xx: make sure to actually clamp depth as requested

We were previously ... not clamping. I guess this meant that everything
got clamped to 1/0, which was enough to pass the existing tests. Or
perhaps the clamping would only happen to the rasterized depth value and
not the frag shader's output depth value.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97231
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
8 years ago a4xx: only disable depth clipping, not all clipping, when requested
Ilia Mirkin [Fri, 19 Aug 2016 00:12:29 +0000 (20:12 -0400)]
a4xx: only disable depth clipping, not all clipping, when requested

The previous bit disables the whole clipper, including the regular
viewport-related clipping that would go on. The two new bits disable
near and far clipping (separately, as verified with the
depth-clamp-range piglit).

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
8 years ago vc4: Switch store_output to using nir_lower_io_to_scalar / component.
Eric Anholt [Fri, 5 Aug 2016 00:31:02 +0000 (17:31 -0700)]
vc4: Switch store_output to using nir_lower_io_to_scalar / component.

8 years ago vc4: Use the intrinsic's first_component for vattr VPM index.
Eric Anholt [Thu, 4 Aug 2016 23:33:16 +0000 (16:33 -0700)]
vc4: Use the intrinsic's first_component for vattr VPM index.

Avoids another multiplication by 4 of the base in the NIR.

8 years ago vc4: Convert to using nir_lower_io_scalar for FS inputs.
Eric Anholt [Thu, 4 Aug 2016 22:00:37 +0000 (15:00 -0700)]
vc4: Convert to using nir_lower_io_scalar for FS inputs.

The scalarizing of FS inputs can be done in a non-driver-dependent manner,
so extract it out of the driver.