mesa.git
7 years ago anv/blorp: Disable resolves for transparent black clears
Nanley Chery [Sat, 21 Jan 2017 21:35:50 +0000 (13:35 -0800)]
anv/blorp: Disable resolves for transparent black clears

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
7 years ago anv/cmd_buffer: Don't temporarily enable CCS_E within a render pass
Nanley Chery [Thu, 19 Jan 2017 18:21:38 +0000 (10:21 -0800)]
anv/cmd_buffer: Don't temporarily enable CCS_E within a render pass

Compressing a render target and decompressing it in the same
single-subpass render pass may waste bandwidth. While this may be
beneficial in some circumstances, it does not help in all. Reclaims
about 1.95% FPS for Dota 2 on some configurations.

v2 (Jason Ekstrand):
- Provide a more thorough comment
- Enable CCS_D for input attachments
v3 (Jason Ekstrand):
- Provide performance numbers

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
7 years ago mesa: Don't crash when destroying contexts created with no visual.
Kenneth Graunke [Thu, 2 Feb 2017 18:10:30 +0000 (10:10 -0800)]
mesa: Don't crash when destroying contexts created with no visual.

dEQP-EGL.functional.create_context.no_config tries to create a context
with no config, then immediately destroys it.  The drawbuffer is never
set up, so we can't dereference it asking if it's double buffered, or
we'll crash on a null pointer dereference.

Just bail early.

Applications using EGL_KHR_no_config_context could hit this.
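
The early bail-out described above can be sketched roughly as follows. This
is only an illustration: the sketch_* types and names are hypothetical
stand-ins, not Mesa's real context or framebuffer structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not Mesa's real structures. */
struct sketch_framebuffer {
   bool double_buffered;
};

struct sketch_context {
   struct sketch_framebuffer *draw_buffer; /* NULL if no config was given */
};

/* Returns true if destroy-time work that depends on a single-buffered
 * draw buffer should run. */
static bool
sketch_needs_front_flush_on_destroy(const struct sketch_context *ctx)
{
   assert(ctx != NULL);

   /* Bail early: a context created without a config has no draw buffer. */
   if (ctx->draw_buffer == NULL)
      return false;

   /* Only now is it safe to ask whether the buffer is double buffered. */
   return !ctx->draw_buffer->double_buffered;
}
```

The point of the guard is that the NULL check comes before any field of the
draw buffer is read.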

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
7 years ago winsys/amdgpu: avoid potential segfault in amdgpu_bo_map()
Samuel Pitoiset [Thu, 2 Feb 2017 17:40:18 +0000 (18:40 +0100)]
winsys/amdgpu: avoid potential segfault in amdgpu_bo_map()

cs can be NULL when it comes from r600_buffer_map_sync_with_rings()
to avoid doing the same checks. It was checked for write mappings
but not for read mappings.
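
A minimal sketch of guarding both paths, assuming a simplified command
stream type. The sketch_* names and fields are hypothetical and do not
mirror the real winsys code:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical command-stream state; illustrative only. */
struct sketch_cs {
   bool references_buffer; /* cs reads or writes the buffer */
   bool writes_buffer;     /* cs writes the buffer */
};

enum sketch_usage { SKETCH_MAP_READ = 1, SKETCH_MAP_WRITE = 2 };

/* Returns true if the caller must flush cs before mapping the buffer. */
static bool
sketch_map_needs_flush(const struct sketch_cs *cs, unsigned usage)
{
   if (usage & SKETCH_MAP_WRITE) {
      /* Write mappings must wait for any use of the buffer by cs. */
      return cs != NULL && cs->references_buffer;
   }

   /* Read mappings only care about pending writes, but cs can be NULL
    * here as well, so the same guard is required. */
   return cs != NULL && cs->writes_buffer;
}
```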

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years ago android: fix droid_create_image_from_prime_fd_yuv for YV12
Tapani Pälli [Thu, 2 Feb 2017 12:05:46 +0000 (14:05 +0200)]
android: fix droid_create_image_from_prime_fd_yuv for YV12

Earlier changes introduced the is_ycrcb flag, which checks the order of
the u and v components. The condition for setting the flag was
incorrect: with ycrcb we are supposed to have cr before cb.
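
The corrected condition can be illustrated with a small sketch. The struct
and its field names are assumptions made for this example, not the gralloc
or EGL structures used by the real code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical plane layout; field names are illustrative only. */
struct sketch_ycbcr {
   size_t y_offset;
   size_t cb_offset; /* U plane */
   size_t cr_offset; /* V plane */
};

static bool
sketch_is_ycrcb(const struct sketch_ycbcr *layout)
{
   assert(layout != NULL);

   /* YCrCb ordering (as in YV12) means Cr comes before Cb in memory. */
   return layout->cr_offset < layout->cb_offset;
}
```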

This patch (together with a fix in our gralloc) fixes corrupted
rendering from 'test-opengl-gl2_yuvtex' native test and corrupted
gallery thumbnail in application switcher on Android-IA.

Fixes: 51727b1cf57e8c4630767eb9ead207b102ffa489
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
7 years ago ilo: EOL unmaintained older gallium intel driver
Edward O'Callaghan [Tue, 6 Dec 2016 00:07:13 +0000 (11:07 +1100)]
ilo: EOL unmaintained older gallium intel driver

This is no longer actively maintained and is just
accumulating bitrot.

Signed-off-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Acked-by: Chia-I Wu <olvaffe@gmail.com>
7 years ago ilo: EOL drop unmaintained gallium drv from buildsys
Edward O'Callaghan [Wed, 1 Feb 2017 13:56:14 +0000 (00:56 +1100)]
ilo: EOL drop unmaintained gallium drv from buildsys

This is no longer actively maintained and is just
accumulating bitrot.

Signed-off-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Acked-by: Chia-I Wu <olvaffe@gmail.com>
7 years ago ilo: EOL unplumb unmaintained gallium drv from winsys
Edward O'Callaghan [Wed, 1 Feb 2017 14:17:07 +0000 (01:17 +1100)]
ilo: EOL unplumb unmaintained gallium drv from winsys

This is no longer actively maintained and is just
accumulating bitrot.

Signed-off-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Acked-by: Chia-I Wu <olvaffe@gmail.com>
7 years ago configure: libdrm is a single package
Ilia Mirkin [Thu, 2 Feb 2017 02:29:12 +0000 (21:29 -0500)]
configure: libdrm is a single package

The intent of the libdrm_$driver version limits has always been to not
burden the "other" drivers with updating their libdrm unless really
necessary. Unfortunately the configure script erroneously only checked
the driver-specific bit and not the generic bit of libdrm as well. Fix
this.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
7 years ago st/mesa: MAX_VARYING is the max supported number of patch varyings, not min
Ilia Mirkin [Thu, 26 Jan 2017 03:31:58 +0000 (22:31 -0500)]
st/mesa: MAX_VARYING is the max supported number of patch varyings, not min

This fixes
GL45-CTS.tessellation_shader.tessellation_shader_tessellation.max_in_out_attributes
on nouveau. We only support 30 patch varyings (as 2 vec4 slots end up
being used for tess level settings), but were getting 32 exposed.
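
The intended clamping can be sketched as below. The constant and helper
names are hypothetical; the value 32 is taken from the numbers quoted in
this message and is otherwise an assumption:

```c
/* Assumed ceiling on what the state tracker can handle. */
#define SKETCH_MAX_VARYING 32
#define SKETCH_MIN2(a, b) ((a) < (b) ? (a) : (b))

static unsigned
sketch_exposed_patch_varyings(unsigned driver_max)
{
   /* The constant is a ceiling, so clamp down to it; never raise the
    * driver's own limit up to it. */
   return SKETCH_MIN2(driver_max, SKETCH_MAX_VARYING);
}
```

With a driver limit of 30 the exposed value stays 30 rather than being
raised to 32.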

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
7 years ago vbo: process buffer binding state changes on draw when recording
Ilia Mirkin [Wed, 1 Feb 2017 21:11:41 +0000 (16:11 -0500)]
vbo: process buffer binding state changes on draw when recording

The VBO module keeps track of any vbo buffers. It updates this list when
receiving an InvalidateState call, however this never happens when
recording draws right now. Make sure that we do all the usual state
updates when recording draws so that the VBO list may be kept up to
date.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99631
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
7 years ago radv/ac: move to using shared emit_ddxy code.
Dave Airlie [Wed, 1 Feb 2017 23:55:45 +0000 (09:55 +1000)]
radv/ac: move to using shared emit_ddxy code.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
7 years ago radeonsi/ac: move most of emit_ddxy to shared code.
Dave Airlie [Wed, 1 Feb 2017 23:52:52 +0000 (09:52 +1000)]
radeonsi/ac: move most of emit_ddxy to shared code.

We can reuse this in radv.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
7 years ago radv/ac: use shared thread id code
Dave Airlie [Wed, 1 Feb 2017 23:39:59 +0000 (09:39 +1000)]
radv/ac: use shared thread id code

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
7 years ago radeonsi/ac: move get thread id to shared code.
Dave Airlie [Wed, 1 Feb 2017 23:35:40 +0000 (09:35 +1000)]
radeonsi/ac: move get thread id to shared code.

radv will use this.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
7 years ago radv/ac: migrate to using shared code for some load/store stuff.
Dave Airlie [Wed, 1 Feb 2017 23:18:22 +0000 (09:18 +1000)]
radv/ac: migrate to using shared code for some load/store stuff.

This migrates to the code shared with radeonsi.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
7 years ago radeonsi/ac: move tbuffer store and buffer load to shared code.
Dave Airlie [Wed, 1 Feb 2017 23:13:44 +0000 (09:13 +1000)]
radeonsi/ac: move tbuffer store and buffer load to shared code.

These are all reusable by radv.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
7 years ago radeonsi/ac: move a bunch of load/store related things to common code.
Dave Airlie [Wed, 1 Feb 2017 22:58:57 +0000 (08:58 +1000)]
radeonsi/ac: move a bunch of load/store related things to common code.

These are all shareable with radv, so start migrating them to the
common code.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
7 years ago texgetimage: Add check for the effective target to GetTextureSubImage
Eduardo Lima Mitev [Thu, 2 Feb 2017 16:07:24 +0000 (17:07 +0100)]
texgetimage: Add check for the effective target to GetTextureSubImage

OpenGL 4.5 spec, section "8.11.4 Texture Image Queries", page 233 of
the PDF states:

    "An INVALID_OPERATION error is generated if texture is the name of a buffer
     or multisample texture."

This is currently not checked, so e.g. a multisample texture image can be
passed down to the driver hook. On i965, this crashes the driver with an
assertion:

intel_mipmap_tree.c:3125: intel_miptree_map: Assertion `mt->num_samples <= 1' failed.

v2: (Ilia Mirkin) Move the check from gettextimage_error_check() to
    GetTextureSubImage() and use the texObj target.
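
As a rough sketch of the check on the texture object's target, assuming a
simplified enum. The sketch_* names are hypothetical; real code would use
the GL_TEXTURE_* enums and the driver's error reporting:

```c
#include <stdbool.h>

/* Hypothetical target enum standing in for the GL_TEXTURE_* values. */
enum sketch_target {
   SKETCH_TEXTURE_2D,
   SKETCH_TEXTURE_BUFFER,
   SKETCH_TEXTURE_2D_MULTISAMPLE,
   SKETCH_TEXTURE_2D_MULTISAMPLE_ARRAY,
};

/* Returns true when the query must fail with INVALID_OPERATION. */
static bool
sketch_subimage_target_is_invalid(enum sketch_target target)
{
   switch (target) {
   case SKETCH_TEXTURE_BUFFER:
   case SKETCH_TEXTURE_2D_MULTISAMPLE:
   case SKETCH_TEXTURE_2D_MULTISAMPLE_ARRAY:
      return true;
   default:
      return false;
   }
}
```

Rejecting these targets up front means the driver hook never sees a
multisample image.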

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
7 years ago Revert "radeonsi: decrease the number of texture slots to 24"
Marek Olšák [Thu, 2 Feb 2017 18:42:22 +0000 (19:42 +0100)]
Revert "radeonsi: decrease the number of texture slots to 24"

This reverts commit bdd860e3076655519d45bd66936ef7be9b7dda63.

Requested by a game developer.

Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years ago configure.ac: explicitly require libdrm for dri classic drivers.
Dave Airlie [Thu, 2 Feb 2017 04:27:50 +0000 (14:27 +1000)]
configure.ac: explicitly require libdrm for dri classic drivers.

Although this might come from somewhere else, require it explicitly.

Reviewed-by: Chad Versace <chadversary@chromium.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
7 years ago intel/isl: Add a better comment for format_supports_ccs_e
Jason Ekstrand [Thu, 2 Feb 2017 17:51:55 +0000 (09:51 -0800)]
intel/isl: Add a better comment for format_supports_ccs_e

Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
7 years ago anv: Remove the finishme for CCS_E with storage images
Jason Ekstrand [Wed, 1 Feb 2017 20:05:07 +0000 (12:05 -0800)]
anv: Remove the finishme for CCS_E with storage images

The data port can't handle CCS at all so replace the finishme with
better comments.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
7 years ago intel/isl: Assert that we don't use CCS for storage images
Jason Ekstrand [Wed, 1 Feb 2017 22:34:27 +0000 (14:34 -0800)]
intel/isl: Assert that we don't use CCS for storage images

I enabled CCS for storage images in the Vulkan driver and ran it through
the CTS.  It didn't result in any hangs but it demonstrated that the data
port cannot handle CCS.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
7 years ago intel/isl: Add a formats_are_ccs_e_compatible helper
Jason Ekstrand [Wed, 1 Feb 2017 19:46:54 +0000 (11:46 -0800)]
intel/isl: Add a formats_are_ccs_e_compatible helper

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
7 years ago intel/isl: Add a format_supports_ccs_d helper
Jason Ekstrand [Wed, 1 Feb 2017 19:41:51 +0000 (11:41 -0800)]
intel/isl: Add a format_supports_ccs_d helper

Nothing uses this yet but it serves as a nice bit of documentation
that's relatively easy to find.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
7 years ago intel/isl: Rename supports_lossless_compression to supports_ccs_e
Jason Ekstrand [Wed, 1 Feb 2017 19:39:26 +0000 (11:39 -0800)]
intel/isl: Rename supports_lossless_compression to supports_ccs_e

The term "lossless compression" could potentially mean multisample
color compression, single-sample color compression or HiZ because they
are all lossless.  The term CCS_E, however, has a very precise meaning
in ISL and is only used to refer to single-sample color compression.
It's also much shorter, which is nice.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
7 years ago anv/pass: Store the depth-stencil attachment's last subpass index
Nanley Chery [Wed, 1 Feb 2017 03:01:18 +0000 (19:01 -0800)]
anv/pass: Store the depth-stencil attachment's last subpass index

Commit 968ffd6c868af7226e8f889573eef709888151cb stored the last subpass
index of all the attachments but that of the depth-stencil attachment.
This could cause depth buffers used in multiple subpasses not to be in
the requested final layout. Fix this error.

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
7 years ago gallium: turn PIPE_SHADER_CAP_DOUBLES into a screen capability
Nicolai Hähnle [Fri, 27 Jan 2017 09:35:13 +0000 (10:35 +0100)]
gallium: turn PIPE_SHADER_CAP_DOUBLES into a screen capability

Make the cap consistent with PIPE_CAP_INT64.

Aside from the hypothetical case of using draw for vertex shaders (and
actually caring about doubles...), every implementation supports doubles
either nowhere or everywhere.

Also, st/mesa didn't even check the cap correctly in all supported
shader stages.

While at it, add a missing LLVM version check for 64-bit integers in
radeonsi. This is conservative: judging by the log, LLVM 3.8 might be
sufficient, but there are probably bugs that have been fixed since then.

v2: fix clover (Marek)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years ago mesa: Enable EXT_compressed_ETC1_RGB8_sub_texture
Plamena Manolova [Tue, 24 Jan 2017 12:39:06 +0000 (12:39 +0000)]
mesa: Enable EXT_compressed_ETC1_RGB8_sub_texture

Since we already have the functionality in place and games
like Game of Thrones seem to depend on this extension, I
think it makes sense to enable it by making it part of
the extension string even though it's still a draft:

https://www.khronos.org/registry/gles/extensions/EXT/EXT_compressed_ETC1_RGB8_sub_texture.txt

Note: OES_compressed_ETC1_RGB8_sub_texture seems to be listed
in gl2ext.h, but there's no documentation for it in the KHR
registry.

Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
7 years ago configure: Only require libdrm 2.4.75 for intel.
Vinson Lee [Wed, 1 Feb 2017 23:28:31 +0000 (23:28 +0000)]
configure: Only require libdrm 2.4.75 for intel.

Fixes: b8acb6b17981 ("configure: Require libdrm >= 2.4.75")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Acked-by: Dave Airlie <airlied@redhat.com>
7 years ago anv: enable VK_KHR_shader_draw_parameters
Lionel Landwerlin [Sun, 29 Jan 2017 04:14:54 +0000 (04:14 +0000)]
anv: enable VK_KHR_shader_draw_parameters

Enables 10 tests from:

   dEQP-VK.draw.shader_draw_parameters.*

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
7 years ago anv: emit DrawID if needed
Lionel Landwerlin [Sun, 29 Jan 2017 03:15:03 +0000 (03:15 +0000)]
anv: emit DrawID if needed

v2: use define for buffer ID (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
7 years ago anv: always allocate a vertex element with vertexid or instanceid
Lionel Landwerlin [Tue, 31 Jan 2017 12:41:32 +0000 (12:41 +0000)]
anv: always allocate a vertex element with vertexid or instanceid

Up to now on Gen8+ we only allocated a vertex element for
gl_InstanceIndex or gl_VertexIndex when a vertex shader uses
gl_BaseInstanceARB or gl_BaseVertexARB. This is because we would
configure the VF_SGVS packet to make the VF unit write the
gl_InstanceIndex & gl_VertexIndex values right behind the values
computed from the vertex buffers.

In the next commit we will also write the gl_DrawIDARB value. Our
backend expects to pull the gl_DrawIDARB value from the element
following the element containing gl_InstanceIndex, gl_VertexIndex,
gl_BaseInstanceARB and gl_BaseVertexARB (see
vec4_vs_visitor::setup_attributes). Therefore we need to allocate an
element for the SGVS elements as long as at least one of the SGVS
elements is read by the shader. Otherwise our shader will use a
gl_DrawIDARB value pulled from the URB one element too far (most
likely garbage).
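
The allocation rule can be sketched as follows. This is a simplified model
under stated assumptions: the struct, its fields, and the index arithmetic
are hypothetical and only illustrate why the draw ID element must follow
an SGVS element whenever any SGVS value is read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of what a vertex shader reads. */
struct sketch_vs_info {
   bool uses_vertexid;
   bool uses_instanceid;
   bool uses_basevertex;
   bool uses_baseinstance;
   bool uses_drawid;
   unsigned num_buffer_elements; /* elements fed from vertex buffers */
};

/* An SGVS element is needed as soon as any one of the four values is read. */
static bool
sketch_needs_sgvs_element(const struct sketch_vs_info *vs)
{
   assert(vs != NULL);
   return vs->uses_vertexid || vs->uses_instanceid ||
          vs->uses_basevertex || vs->uses_baseinstance;
}

/* Index of the element the shader pulls the draw ID from: right after the
 * SGVS element when there is one. */
static unsigned
sketch_drawid_element_index(const struct sketch_vs_info *vs)
{
   assert(vs != NULL && vs->uses_drawid);
   return vs->num_buffer_elements + (sketch_needs_sgvs_element(vs) ? 1 : 0);
}
```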

v2: Fix my english (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
7 years ago anv: move BaseVertexID/BaseInstanceID vertex buffer index to 31
Lionel Landwerlin [Sun, 29 Jan 2017 02:46:12 +0000 (02:46 +0000)]
anv: move BaseVertexID/BaseInstanceID vertex buffer index to 31

v2: use define for buffer ID (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
7 years ago anv: limit vertex buffers to 31
Lionel Landwerlin [Sun, 29 Jan 2017 02:44:32 +0000 (02:44 +0000)]
anv: limit vertex buffers to 31

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
7 years ago android: fix llvm, elf dependencies for M, N releases
Mauro Rossi [Mon, 30 Jan 2017 19:57:30 +0000 (20:57 +0100)]
android: fix llvm, elf dependencies for M, N releases

These changes set the correct llvm version and elf include path,
which differ for Marshmallow and Nougat.

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
7 years ago anv: Don't use bogus alpha swizzles
Jason Ekstrand [Wed, 1 Feb 2017 20:27:59 +0000 (12:27 -0800)]
anv: Don't use bogus alpha swizzles

For RGB formats in Vulkan, we use the corresponding RGBA format with a
swizzle of RGB1.  While this swizzle is exactly what we want for
texturing, it's not allowed for rendering according to the docs.  While
we haven't been getting hangs or anything, we should probably obey the
docs.  This commit just sanitizes all render swizzles so that the alpha
channel maps to ALPHA.
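
A minimal sketch of the sanitizing step, using hypothetical sketch_* types
rather than the real ISL swizzle definitions:

```c
/* Hypothetical channel selectors; the real ISL enums differ. */
enum sketch_channel {
   SKETCH_ZERO,
   SKETCH_ONE,
   SKETCH_RED,
   SKETCH_GREEN,
   SKETCH_BLUE,
   SKETCH_ALPHA,
};

struct sketch_swizzle {
   enum sketch_channel r, g, b, a;
};

/* For rendering, force the alpha channel to map to ALPHA and leave the
 * colour channels untouched. */
static struct sketch_swizzle
sketch_render_swizzle(struct sketch_swizzle s)
{
   s.a = SKETCH_ALPHA;
   return s;
}
```

The texturing swizzle (RGB1) is left as it is; only the copy used for the
render target is rewritten.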

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
7 years ago Add missing copyright header to wayland-egl-priv.h
Micah Fedke [Thu, 12 Jan 2017 15:38:21 +0000 (10:38 -0500)]
Add missing copyright header to wayland-egl-priv.h

Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
7 years ago radv: handle VK_QUEUE_FAMILY_IGNORED in image transitions (v3)
Dave Airlie [Tue, 31 Jan 2017 05:18:33 +0000 (15:18 +1000)]
radv: handle VK_QUEUE_FAMILY_IGNORED in image transitions (v3)

The CTS tests at least are using this, and we were totally
ignoring it.

This hopefully fixes the bouncing multisample CTS tests.

v2: get family mask in ignored case from command buffer.
v3: only change things in one place, use logic from Bas.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
7 years ago radv/ac: handle clip/cull distance sizing in geometry shader outputs
Dave Airlie [Wed, 1 Feb 2017 01:10:49 +0000 (11:10 +1000)]
radv/ac: handle clip/cull distance sizing in geometry shader outputs

Otherwise we were writing these as 4 components, and things went bad.

Fixes (the remaining):
dEQP-VK.clipping.user_defined.*.vert_geom.*

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
7 years ago radv/ac: add const_index to fetch index for gs inputs
Dave Airlie [Wed, 1 Feb 2017 00:43:36 +0000 (10:43 +1000)]
radv/ac: add const_index to fetch index for gs inputs

This fixes clip distance fetches as they are single item loads
with a const_index like float[1].

Fixes:
dEQP-VK.clipping.user_defined.*.vert_geom.[0-6]

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
7 years ago radeonsi/ac: move frag interp emission code to shared llvm code.
Dave Airlie [Wed, 1 Feb 2017 04:47:45 +0000 (14:47 +1000)]
radeonsi/ac: move frag interp emission code to shared llvm code.

This code should be used in radv, so move it to a shared location
in advance of doing that.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
7 years ago st/mesa: inline get_mesa_program()
Timothy Arceri [Wed, 1 Feb 2017 00:25:05 +0000 (11:25 +1100)]
st/mesa: inline get_mesa_program()

In the past I've gotten this function confused with the one in
ir_to_mesa.cpp of the same name. Now that the affected flag setting
has moved into a helper, it makes sense just to inline this remaining
code.

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years ago st/mesa: create set_prog_affected_state_flags() helper
Timothy Arceri [Tue, 31 Jan 2017 06:15:09 +0000 (17:15 +1100)]
st/mesa: create set_prog_affected_state_flags() helper

This will be used when restoring tgsi from the on-disk shader
cache.

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years ago st/mesa: st_atom_shader.c C99 tidy up
Timothy Arceri [Mon, 30 Jan 2017 23:34:59 +0000 (10:34 +1100)]
st/mesa: st_atom_shader.c C99 tidy up

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years ago st/mesa: remove pre C99 statement block for variable declaration
Timothy Arceri [Mon, 30 Jan 2017 23:20:41 +0000 (10:20 +1100)]
st/mesa: remove pre C99 statement block for variable declaration

Acked-by: Marek Olšák <marek.olsak@amd.com>
7 years ago isl: Add assertions for render target swizzle restrictions
Jason Ekstrand [Wed, 1 Feb 2017 00:17:26 +0000 (16:17 -0800)]
isl: Add assertions for render target swizzle restrictions

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
7 years ago st/va: add h264 constrained baseline profile
Boyuan Zhang [Fri, 16 Dec 2016 21:20:54 +0000 (16:20 -0500)]
st/va: add h264 constrained baseline profile

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
7 years ago st/vdpau: add h264 constrained baseline profile
Boyuan Zhang [Fri, 16 Dec 2016 20:24:03 +0000 (15:24 -0500)]
st/vdpau: add h264 constrained baseline profile

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
7 years ago radeon/uvd: add h264 constrained baseline support
Boyuan Zhang [Fri, 16 Dec 2016 20:22:13 +0000 (15:22 -0500)]
radeon/uvd: add h264 constrained baseline support

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
7 years ago vl: add h264 constrained baseline profile
Boyuan Zhang [Fri, 16 Dec 2016 20:19:25 +0000 (15:19 -0500)]
vl: add h264 constrained baseline profile

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
7 years ago radv: Enable VK_KHR_shader_draw_parameters.
Bas Nieuwenhuizen [Tue, 31 Jan 2017 20:37:48 +0000 (21:37 +0100)]
radv: Enable VK_KHR_shader_draw_parameters.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
7 years ago radv: Pass draw index to shader.
Bas Nieuwenhuizen [Tue, 31 Jan 2017 20:25:41 +0000 (21:25 +0100)]
radv: Pass draw index to shader.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
7 years ago radv/ac: Add draw index support.
Bas Nieuwenhuizen [Tue, 31 Jan 2017 20:21:47 +0000 (21:21 +0100)]
radv/ac: Add draw index support.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
7 years ago i965: Prevent coverity warning
Robert Foss [Wed, 1 Feb 2017 16:24:39 +0000 (11:24 -0500)]
i965: Prevent coverity warning

Add assert checking that num_sources is never larger than 3.

This prevents Coverity from concluding that the unhandled
cases of num_sources not being 0-3 are relevant.
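
The pattern looks roughly like the sketch below. The function and its
array argument are hypothetical; only the assert on num_sources comes from
the description above.

```c
#include <assert.h>

/* Sums the sizes of the first num_sources operands of an instruction
 * that has at most three sources. */
static unsigned
sketch_sum_source_sizes(const unsigned sizes[3], unsigned num_sources)
{
   /* Documents the invariant for readers and static analysers alike:
    * indices past the end of sizes[] are never reached. */
   assert(num_sources <= 3);

   unsigned total = 0;
   for (unsigned i = 0; i < num_sources; i++)
      total += sizes[i];
   return total;
}
```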

Coverity-Id: 1399480-1399489
Signed-off-by: Robert Foss <robert.foss@collabora.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
7 years ago spirv: add SPV_KHR_shader_draw_parameters support
Lionel Landwerlin [Wed, 25 Jan 2017 13:58:14 +0000 (13:58 +0000)]
spirv: add SPV_KHR_shader_draw_parameters support

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
7 years ago compiler: add missing enums for debug
Lionel Landwerlin [Mon, 30 Jan 2017 17:58:56 +0000 (17:58 +0000)]
compiler: add missing enums for debug

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
7 years ago docs: add news item and link release notes for 13.0.4
Emil Velikov [Wed, 1 Feb 2017 11:21:59 +0000 (11:21 +0000)]
docs: add news item and link release notes for 13.0.4

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years ago docs: add sha256 checksums for 13.0.4
Emil Velikov [Wed, 1 Feb 2017 11:19:37 +0000 (11:19 +0000)]
docs: add sha256 checksums for 13.0.4

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 6bfc352f5a35ab21f012d6d501821ffbf767aab3)

7 years ago docs: add release notes for 13.0.4
Emil Velikov [Wed, 1 Feb 2017 10:10:38 +0000 (10:10 +0000)]
docs: add release notes for 13.0.4

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 3255d10da4c2703bfdfcefd8f59b0d8f21dbb43f)

7 years ago winsys/radeon: Allow visible VRAM size > 256MB with kernel driver >= 2.49
Michel Dänzer [Tue, 31 Jan 2017 06:33:19 +0000 (15:33 +0900)]
winsys/radeon: Allow visible VRAM size > 256MB with kernel driver >= 2.49

The kernel driver reports correct values now.

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
7 years ago android: add vulkan build for intel
Tapani Pälli [Mon, 30 Jan 2017 11:37:47 +0000 (13:37 +0200)]
android: add vulkan build for intel

fixes to issues spotted by Emil Velikov:

   - set ANV_TIMESTAMP correctly
   - fix typo with VULKAN_GEM_FILES

v2: update to use Makefile.sources under vulkan
    instead of having own

v3: update to changes to generate from vk.xml
    (commit c7fc310)

v4: remove 'hw' relative path
    cleanups, remove unnecessary cruft

    review from Emil Velikov:

    - move to vulkan folder
    - remove timestamp gen, no longer necessary
    - more cleanups

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
7 years ago mesa: use same is_color_attachment trick to discern error cases
Ilia Mirkin [Tue, 24 Jan 2017 05:26:29 +0000 (00:26 -0500)]
mesa: use same is_color_attachment trick to discern error cases

All the other calls to retrieve the attachment have been covered except
this one. Return the proper error for attachment points that are valid
enums but out of bounds for the driver.

Fixes GL45-CTS.geometry_shader.layered_fbo.fb_texture_invalid_attachment

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years ago anv: Improve flushing around STATE_BASE_ADDRESS
Jason Ekstrand [Tue, 31 Jan 2017 03:53:17 +0000 (19:53 -0800)]
anv: Improve flushing around STATE_BASE_ADDRESS

It is not clear from the docs exactly how pipelined STATE_BASE_ADDRESS
actually is.  We know from experimentation that we need to flush the
render cache prior to emitting STATE_BASE_ADDRESS and invalidate the
texture cache afterwards.  The only thing the PRM says is that, on gen8+,
we're supposed to invalidate the state cache after STATE_BASE_ADDRESS,
but experimentation has indicated that doing so does nothing whatsoever.

Since we don't really know, let's do just a bit more flushing in the
hopes that this won't be a problem again.  In particular:

 1) Do a CS stall before we emit STATE_BASE_ADDRESS since we don't
    really know whether or not it's pipelined.

 2) Do a data cache flush in case what runs before STATE_BASE_ADDRESS
    is a compute shader.

 3) Invalidate the state and constant caches after STATE_BASE_ADDRESS
    because the state may be getting cached there (we don't really know).
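
The resulting sequence can be summarised with a small sketch. The flag
names below are hypothetical placeholders for the driver's pipe-control
bits, not the actual anv definitions:

```c
#include <stdint.h>

/* Hypothetical flag bits; the real driver has its own definitions. */
enum {
   SKETCH_CS_STALL                  = 1 << 0,
   SKETCH_RENDER_CACHE_FLUSH        = 1 << 1,
   SKETCH_DATA_CACHE_FLUSH          = 1 << 2,
   SKETCH_TEXTURE_CACHE_INVALIDATE  = 1 << 3,
   SKETCH_STATE_CACHE_INVALIDATE    = 1 << 4,
   SKETCH_CONSTANT_CACHE_INVALIDATE = 1 << 5,
};

/* Bits to flush before emitting STATE_BASE_ADDRESS. */
static uint32_t
sketch_flush_before_sba(void)
{
   return SKETCH_CS_STALL |            /* (1) may not be pipelined */
          SKETCH_RENDER_CACHE_FLUSH |  /* known from experimentation */
          SKETCH_DATA_CACHE_FLUSH;     /* (2) preceding compute work */
}

/* Bits to invalidate after emitting STATE_BASE_ADDRESS. */
static uint32_t
sketch_invalidate_after_sba(void)
{
   return SKETCH_TEXTURE_CACHE_INVALIDATE |
          SKETCH_STATE_CACHE_INVALIDATE |   /* (3) */
          SKETCH_CONSTANT_CACHE_INVALIDATE; /* (3) */
}
```

Flushes and invalidates are kept in separate sets so that neither runs on
the wrong side of the STATE_BASE_ADDRESS packet.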

Reported-by: Mark Janes <mark.a.janes@intel.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
7 years ago anv: Flush render cache before STATE_BASE_ADDRESS on gen7
Jason Ekstrand [Tue, 31 Jan 2017 23:06:56 +0000 (15:06 -0800)]
anv: Flush render cache before STATE_BASE_ADDRESS on gen7

We had no good reason for *not* doing this on gen7 before but we didn't
know it was needed.  Recently, when trying to update to Vulkan CTS version
1.0.2 in our CI system, Mark discovered GPU hangs on Haswell that appear
to be STATE_BASE_ADDRESS related.  This commit fixes them.

Reported-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
7 years ago isl/formats: Only advertise sampling for A4B4G4R4 on Broadwell
Jason Ekstrand [Fri, 27 Jan 2017 20:31:40 +0000 (12:31 -0800)]
isl/formats: Only advertise sampling for A4B4G4R4 on Broadwell

This causes hangs on Broadwell if you try to render to it.  I have no
idea how we managed to not hit this earlier.

Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
7 years ago intel/blorp: Handle clearing of A4B4G4R4 on all platforms
Jason Ekstrand [Fri, 27 Jan 2017 20:32:05 +0000 (12:32 -0800)]
intel/blorp: Handle clearing of A4B4G4R4 on all platforms

Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
7 years ago radeonsi: Fix build on LLVM < 3.9 v2
Tom Stellard [Wed, 1 Feb 2017 00:18:01 +0000 (00:18 +0000)]
radeonsi: Fix build on LLVM < 3.9 v2

This was broken by: e0cc0a614c96011958bc3a1b84da9168e0e1ccbb

v2:
  - Use preprocessor macro

Tested-by: Mark Janes <mark.a.janes@intel.com>
7 years ago radv: Enable Float64 support.
Bas Nieuwenhuizen [Sun, 29 Jan 2017 22:07:10 +0000 (23:07 +0100)]
radv: Enable Float64 support.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
7 years ago radv/ac: Implement Float64 SSBO loads.
Bas Nieuwenhuizen [Sun, 8 Jan 2017 18:38:28 +0000 (19:38 +0100)]
radv/ac: Implement Float64 SSBO loads.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
7 years ago radv/ac: Implement Float64 UBO loads.
Bas Nieuwenhuizen [Sun, 8 Jan 2017 00:36:30 +0000 (01:36 +0100)]
radv/ac: Implement Float64 UBO loads.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
7 years ago radv/ac: Implement Float64 load/store var.
Bas Nieuwenhuizen [Sun, 8 Jan 2017 00:31:07 +0000 (01:31 +0100)]
radv/ac: Implement Float64 load/store var.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
7 years ago radv/ac: Implement Float64 SSBO stores.
Bas Nieuwenhuizen [Thu, 5 Jan 2017 00:36:26 +0000 (01:36 +0100)]
radv/ac: Implement Float64 SSBO stores.

No f16 support as I'm not quite sure about alignment yet.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
7 years ago radv/ac: Add core Float64 support.
Bas Nieuwenhuizen [Thu, 5 Jan 2017 00:09:12 +0000 (01:09 +0100)]
radv/ac: Add core Float64 support.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
7 years ago vc4: Enable Neon on arm android builds
Rob Herring [Mon, 30 Jan 2017 22:54:53 +0000 (16:54 -0600)]
vc4: Enable Neon on arm android builds

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
7 years ago vc4: fix arm64 build with Neon
Rob Herring [Mon, 30 Jan 2017 22:54:52 +0000 (16:54 -0600)]
vc4: fix arm64 build with Neon

The addition of Neon assembly breaks on arm64 builds because the assembly
syntax is different. For now, restrict Neon to ARMv7 builds.
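
A compile-time guard of the kind described above could look roughly like
the sketch below. It relies on the compiler-predefined macros __ARM_ARCH
and __ARM_NEON; the SKETCH_USE_ARMV7_NEON_ASM macro and the helper
function are hypothetical names for illustration, not the ones used in vc4.

```c
/* Hypothetical feature macro (an assumption, not Mesa's name): enable the
 * ARMv7 Neon assembly path only when building for a 32-bit ARMv7 target
 * with Neon available.  arm64 compilers report __ARM_ARCH >= 8, so they
 * take the fallback path. */
#if defined(__ARM_NEON) && defined(__ARM_ARCH) && (__ARM_ARCH == 7)
#define SKETCH_USE_ARMV7_NEON_ASM 1
#else
#define SKETCH_USE_ARMV7_NEON_ASM 0
#endif

/* Reports which path this translation unit was compiled with. */
static int sketch_uses_armv7_neon_asm(void)
{
    return SKETCH_USE_ARMV7_NEON_ASM;
}
```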

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
7 years ago vc4: Make Neon inline assembly clang compatible
Rob Herring [Mon, 30 Jan 2017 22:54:51 +0000 (16:54 -0600)]
vc4: Make Neon inline assembly clang compatible

clang throws an error on "%r2" and similar. I couldn't find any
documentation on what "%r?" is supposed to mean and I've never seen any
use like that as far as I remember. The parameter is supposed to be
cpu_stride and just %2/%3 should be sufficient.

There's no need for trailing ";" either, so remove those, too.

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
7 years ago radeonsi: Set datalayout on the llvm module
Tom Stellard [Thu, 15 Dec 2016 15:25:49 +0000 (15:25 +0000)]
radeonsi: Set datalayout on the llvm module

This prevents LLVM from using sext instructions for local memory offsets
and allows the backend to fold immediate offsets into the instruction.

This also prevents some incorrect code generation for ptrtoint and
inttoptr instructions.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years ago nir/spirv/glsl450: Implement IEEE-compliant handling of atan2(±∞, ±∞).
Francisco Jerez [Tue, 24 Jan 2017 07:36:46 +0000 (23:36 -0800)]
nir/spirv/glsl450: Implement IEEE-compliant handling of atan2(±∞, ±∞).

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
7 years ago glsl: Implement IEEE-compliant handling of atan2(±∞, ±∞).
Francisco Jerez [Tue, 24 Jan 2017 21:43:07 +0000 (13:43 -0800)]
glsl: Implement IEEE-compliant handling of atan2(±∞, ±∞).

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
7 years ago nir/spirv/glsl450: Rewrite atan2 implementation to fix accuracy and handling of zero...
Francisco Jerez [Fri, 20 Jan 2017 23:24:30 +0000 (15:24 -0800)]
nir/spirv/glsl450: Rewrite atan2 implementation to fix accuracy and handling of zero/infinity.

See "glsl: Rewrite atan2 implementation to fix accuracy and handling
of zero/infinity." for the rationale, but note that the instruction
count benefit discussed there is somewhat less important for the SPIRV
implementation, because the current code already emitted no control
flow instructions -- Still this saves us one hardware instruction per
scalar component on Intel SKL hardware.

Fixes the following Vulkan CTS tests on Intel hardware:

    dEQP-VK.glsl.builtin.precision.atan2.highp_compute.scalar
    dEQP-VK.glsl.builtin.precision.atan2.highp_compute.vec2
    dEQP-VK.glsl.builtin.precision.atan2.highp_compute.vec3
    dEQP-VK.glsl.builtin.precision.atan2.highp_compute.vec4
    dEQP-VK.glsl.builtin.precision.atan2.mediump_compute.vec2
    dEQP-VK.glsl.builtin.precision.atan2.mediump_compute.vec4

Note that most of the test-cases above expect IEEE-compliant handling
of atan2(±∞, ±∞), which this patch doesn't explicitly handle, so
except for the last two the test-cases above weren't expected to pass
yet.  The reason they do is that the i965 back-end implementation of
the NIR fmin and fmax instructions is not quite GLSL-compliant (it
complies with IEEE 754 recommendations though), because fmin/fmax of a
NaN and a non-NaN argument currently always return the non-NaN
argument, which causes atan() to flush NaN to one and return the
expected value.  The front-end should probably not be relying on this
behavior for correctness though because other back-ends are likely to
behave differently -- A follow-up patch will handle the atan2(±∞, ±∞)
corner cases explicitly.
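
The dependence on fmin behavior described above can be illustrated with a
small C sketch. It only models the NaN handling, not the real atan code:
the function names are hypothetical, and clamping |y/x| to one is a
simplification of whatever the actual implementation does.

```c
#include <math.h>   /* INFINITY */

/* min() as described above for the i965 back-end: when exactly one
 * operand is NaN, the non-NaN operand is returned. */
static float sketch_fmin_non_nan(float a, float b)
{
    if (a != a) return b;   /* a is NaN */
    if (b != b) return a;   /* b is NaN */
    return a < b ? a : b;
}

/* A min() that propagates NaN instead, as another back-end might. */
static float sketch_fmin_propagating(float a, float b)
{
    if (a != a) return a;
    if (b != b) return b;
    return a < b ? a : b;
}

/* Clamps |y/x| to at most one using the supplied min().  For
 * y = x = INFINITY the quotient is NaN before the clamp. */
static float sketch_clamped_ratio(float y, float x,
                                  float (*min_fn)(float, float))
{
    float q = y / x;
    float abs_q = q < 0.0f ? -q : q;   /* NaN stays NaN here */
    return min_fn(abs_q, 1.0f);
}
```

With sketch_fmin_non_nan the NaN quotient is flushed to one, so a later
computation sees a usable value; with sketch_fmin_propagating the NaN
survives, which is why relying on the first behavior is fragile.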

v2: Fix up argument scaling to take into account the range and
    precision of exotic FP24 hardware.  Flip coordinate system for
    arguments along the vertical line as if they were on the left
    half-plane in order to avoid division by zero which may give
    unspecified results on non-GLSL 4.1-capable hardware.  Sprinkle in
    some more comments.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
7 years ago glsl: Rewrite atan2 implementation to fix accuracy and handling of zero/infinity.
Francisco Jerez [Sat, 21 Jan 2017 21:41:08 +0000 (13:41 -0800)]
glsl: Rewrite atan2 implementation to fix accuracy and handling of zero/infinity.

This addresses several issues of the current atan2 implementation:

 - Negative zero (and negative denorms which end up getting flushed to
   zero) isn't handled correctly by the current implementation.  The
   reason is that it does 'y >= 0' and 'x < 0' comparisons to decide
   on which side of the branch cut the argument is, which causes us to
   return incorrect results (off by up to 2π) for very small negative
   values.

 - There is a serious precision problem for x values of large enough
   magnitude introduced by the floating point division operation being
   implemented as a mul+rcp sequence.  This can lead to the quotient
   getting flushed to zero in some cases introducing an error of over
   8e6 ULP in the result -- Or in the most catastrophic case will
   cause us to return NaN instead of the correct value ±π/2 for y=±∞
   and x very large.  We can fix this easily by scaling down both
   arguments when the absolute value of the denominator goes above
   a certain threshold.  The error of this atan2 implementation remains
   below 25 ULP in most of its domain except for a neighborhood of y=0
   where it reaches a maximum error of about 180 ULP.

 - It emits a bunch of instructions including no less than three
   if-else branches per scalar component that don't seem to get
   optimized out later on.  This implementation uses about 13% fewer
   instructions on Intel SKL hardware and doesn't emit any control
   flow instructions.

v2: Fix up argument scaling to take into account the range and
    precision of exotic FP24 hardware.  Flip coordinate system for
    arguments along the vertical line as if they were on the left
    half-plane in order to avoid division by zero which may give
    unspecified results on non-GLSL 4.1-capable hardware.  Sprinkle in
    some more comments.
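
The first two issues in the list above can be reproduced in plain C. The
sketch below is not the GLSL IR implementation: all names are
hypothetical, flush-to-zero is emulated in software, and the 1e18
threshold and 0.25 scale factor are illustrative assumptions only.

```c
#include <float.h>  /* FLT_MIN */
#include <math.h>   /* signbit */

static const float SKETCH_PI = 3.14159265358979323846f;

/* Side of the branch cut picked with a comparison: -0.0f >= 0.0f is
 * true, so negative zero ends up on the +pi side. */
static float sketch_offset_by_compare(float y)
{
    return y >= 0.0f ? SKETCH_PI : -SKETCH_PI;
}

/* Same choice made from the sign bit, which distinguishes -0.0f. */
static float sketch_offset_by_signbit(float y)
{
    return signbit(y) ? -SKETCH_PI : SKETCH_PI;
}

/* Emulates hardware that flushes denormal values to zero. */
static float sketch_ftz(float v)
{
    float mag = v < 0.0f ? -v : v;
    return mag < FLT_MIN ? 0.0f : v;
}

/* Division implemented as a mul+rcp sequence with flush-to-zero. */
static float sketch_div_mul_rcp(float y, float x)
{
    float rcp = sketch_ftz(1.0f / x);
    return sketch_ftz(y * rcp);
}

/* Scales both arguments down when |x| is large so that the reciprocal
 * stays in the normal range.  Threshold and factor are assumptions. */
static float sketch_div_scaled(float y, float x)
{
    float mag = x < 0.0f ? -x : x;
    float scale = mag >= 1e18f ? 0.25f : 1.0f;
    return sketch_div_mul_rcp(y * scale, x * scale);
}
```

For y = -0.0 the two offset helpers differ by 2π, matching the error
described in the first item. For y = x = 3e38 the unscaled quotient is
flushed to zero, while the scaled one comes out close to one.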

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
7 years ago i965/fs: Fix nir_op_fsign of absolute value.
Francisco Jerez [Tue, 24 Jan 2017 20:26:54 +0000 (12:26 -0800)]
i965/fs: Fix nir_op_fsign of absolute value.

This does point at the front-end emitting silly code that could have
been optimized out, but the current fsign implementation would emit
bogus IR if abs was set for the argument (because it would apply the
abs modifier on an unsigned integer type), and we shouldn't rely on
the upper layer's optimization passes for correctness.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
7 years ago glsl/ir_builder: Add rcp builder.
Francisco Jerez [Tue, 24 Jan 2017 07:59:45 +0000 (23:59 -0800)]
glsl/ir_builder: Add rcp builder.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
7 years ago glsl: Fix constant evaluation of the rcp op.
Francisco Jerez [Tue, 24 Jan 2017 19:41:46 +0000 (11:41 -0800)]
glsl: Fix constant evaluation of the rcp op.

Will avoid a regression in a future commit that introduces some
additional rcp operations.  According to the GLSL 4.10 specification:

"Dividing by 0 results in the appropriately signed IEEE Inf."
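
As a rough C illustration of the quoted rule (not the actual constant
folding code), compare a folder that skips zero operands with one that
lets IEEE 754 division produce the signed infinity. Both function names
are hypothetical, and the guarded variant is only an assumed example of
how a folder can deviate from the rule.

```c
#include <math.h>   /* INFINITY */

/* Assumed faulty variant: a zero operand is skipped, so the result
 * keeps its default value of 0.0f. */
static float sketch_fold_rcp_guarded(float x)
{
    float result = 0.0f;
    if (x != 0.0f)
        result = 1.0f / x;
    return result;
}

/* Variant following the quoted rule: on an IEEE 754 platform,
 * 1.0f / +0.0f is +Inf and 1.0f / -0.0f is -Inf. */
static float sketch_fold_rcp_ieee(float x)
{
    return 1.0f / x;
}
```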

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
7 years ago mesa/program: Translate csel operation from GLSL IR.
Francisco Jerez [Tue, 24 Jan 2017 07:53:03 +0000 (23:53 -0800)]
mesa/program: Translate csel operation from GLSL IR.

This will be used internally by the GLSL front-end in order to
implement some built-in functions. Plumb it through MESA IR for
back-ends that rely on this translation pass.

v2: Add comment.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
7 years ago etnaviv: Set SE.CLIP registers, add margins for scissor/clip registers
Wladimir J. van der Laan [Fri, 25 Nov 2016 06:42:43 +0000 (06:42 +0000)]
etnaviv: Set SE.CLIP registers, add margins for scissor/clip registers

This fixes rendering of full-screen quads (and other screen-filling
geometry, e.g. ioquake3 walls up-close) on gc3000. It should be a no-op
on other hardware.

- It looks like SE_CLIP registers were not set at all.
  I'm amazed that rendering worked without them. Emit them to
  avoid issues on gc3000.

- Define constants
  ETNA_SE_SCISSOR_MARGIN_RIGHT (0x1119)
  ETNA_SE_SCISSOR_MARGIN_BOTTOM (0x1111)
  ETNA_SE_CLIP_MARGIN_RIGHT (0xffff)
  ETNA_SE_CLIP_MARGIN_BOTTOM (0xffff)

  These demarcate the margin (fixp16) between the computed sizes and the
  value sent to the chip. I have set these to the numbers used by the
  Vivante driver for gc2000. I am not sure whether any old hardware was
  relying on the old numbers, or whether those were just a guess. But if
  so, these need to be moved to the _specs structure.

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Acked-by: Christian Gmeiner <christian.gmeiner@gmail.com>
7 years ago etnaviv: Generate new sin/cos instructions on GC3000
Wladimir J. van der Laan [Tue, 31 Jan 2017 08:23:51 +0000 (09:23 +0100)]
etnaviv: Generate new sin/cos instructions on GC3000

Shaders using sin/cos instructions were not working on GC3000.

The reason for this turns out to be that these chips implement sin/cos
in a different way (but using the same opcodes):

- Need their input scaled by 1/pi instead of 2/pi.

- Output an x and y component, which need to be multiplied to
  get the result.

- tex_amode needs to be set to 1.

Add a new bit to the compiler specs and generate these instructions
as necessary.
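
A minimal sketch of selecting the input scale from such a spec bit is
shown below. The struct, the field name and the helper are hypothetical,
and the sketch covers only the input scaling, not the x/y output multiply
or the tex_amode setting.

```c
#include <stdbool.h>

/* Hypothetical compiler specs; the field name is an assumption. */
struct sketch_specs {
    bool has_new_sin_cos;   /* set for chips with the new sin/cos behavior */
};

/* Returns the factor applied to the sin/cos input: 1/pi on chips with
 * the new behavior, 2/pi otherwise. */
static float sketch_sin_cos_input_scale(const struct sketch_specs *specs)
{
    const float inv_pi = 0.318309886f;   /* 1/pi */
    return specs->has_new_sin_cos ? inv_pi : 2.0f * inv_pi;
}
```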

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Acked-by: Christian Gmeiner <christian.gmeiner@gmail.com>
7 years ago anv/cmd_buffer: Use the proper depth input attachment surface state
Nanley Chery [Mon, 30 Jan 2017 20:27:15 +0000 (12:27 -0800)]
anv/cmd_buffer: Use the proper depth input attachment surface state

Commit 2852efcda40274acf3272611c6a3b7731523a72d moved the location of
the depth input attachment surface state from the render pass to the
image view, but failed to update the surface state location used when
emitting the binding table. Fix this by loading the surface state from
the correct location.

Fixes:
dEQP-VK.renderpass.formats.d16_unorm.input.*
dEQP-VK.renderpass.formats.d24_unorm_s8_uint.input.*
dEQP-VK.renderpass.formats.d32_sfloat.input.*
dEQP-VK.renderpass.formats.x8_d24_unorm_pack32.input.*
dEQP-VK.renderpass.attachment_allocation.input_output.93
dEQP-VK.renderpass.attachment_allocation.input_output.92
dEQP-VK.renderpass.attachment_allocation.input_output.82
dEQP-VK.renderpass.attachment_allocation.input_output.46

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
7 years ago glsl: fix heap-buffer-overflow
Bartosz Tomczyk [Tue, 31 Jan 2017 11:02:20 +0000 (12:02 +0100)]
glsl: fix heap-buffer-overflow

The `end+1` skips the ']', whereas the `strlen+1` includes the final
'\0' in the move to terminate the string.
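
The arithmetic can be seen in a standalone C sketch that removes a
"[...]" subscript from a name in place. The function name and the example
string are assumptions for illustration; this is not the code touched by
the patch.

```c
#include <string.h>

/* Removes the first "[...]" subscript from a NUL-terminated name in
 * place.  Returns 1 if a subscript was removed, 0 otherwise. */
static int sketch_strip_subscript(char *name)
{
    char *open = strchr(name, '[');
    if (open == NULL)
        return 0;

    char *end = strchr(open, ']');
    if (end == NULL)
        return 0;

    /* end + 1 skips the ']'.  strlen(end + 1) counts the remaining
     * characters and the extra + 1 copies the terminating '\0', so the
     * move never reads past the end of the string. */
    memmove(open, end + 1, strlen(end + 1) + 1);
    return 1;
}
```

For example, calling it on a buffer holding "foo[3].bar" leaves "foo.bar".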

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years ago etnaviv: Cannot render to rb-swapped formats
Wladimir J. van der Laan [Wed, 7 Dec 2016 12:59:54 +0000 (12:59 +0000)]
etnaviv: Cannot render to rb-swapped formats

Exposing rb swapped (or other swizzled) formats for rendering would
involve swizzling in the pixel shader. This is not the case at the
moment, so reject requests for creating such surfaces.

(GPUs that need an extra resolve step anyway due to multiple pixel
pipes, such as gc2000, might also do this swap in the resolve operation.
But this would be tricky to keep track of)

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Acked-by: Christian Gmeiner <christian.gmeiner@gmail.com>
7 years ago etnaviv: Avoid infinite loop in find_frame()
Christian Gmeiner [Tue, 31 Jan 2017 08:10:27 +0000 (09:10 +0100)]
etnaviv: Avoid infinite loop in find_frame()

Use of unsigned loop control variable with '>= 0' would lead
to an infinite loop.

Reported by clang:

etnaviv_compiler.c:1024:39: warning: comparison of unsigned expression
>= 0 is always true [-Wtautological-compare]
   for (unsigned sp = c->frame_sp; sp >= 0; sp--)
                                   ~~ ^  ~

v2: Simply use the same datatype as c->frame_sp is using.
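
A reduced C sketch of the fixed loop shape follows. The struct layout and
all names are hypothetical, and frame_sp is assumed to be a signed int
here; with an unsigned counter, sp-- at zero would wrap around to a large
value instead of ending the loop.

```c
#define SKETCH_MAX_FRAMES 8

/* Hypothetical stand-in for the compiler state. */
struct sketch_compiler {
    int frame_sp;                        /* index of the top frame, -1 if empty */
    int frame_type[SKETCH_MAX_FRAMES];   /* type tag of each frame */
};

/* Walks the frame stack from the top down and returns the index of the
 * first frame with the requested type, or -1 if there is none.  Because
 * sp is signed, "sp >= 0" becomes false after index 0 was visited. */
static int sketch_find_frame(const struct sketch_compiler *c, int type)
{
    for (int sp = c->frame_sp; sp >= 0; sp--) {
        if (c->frame_type[sp] == type)
            return sp;
    }
    return -1;
}
```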

CC: <mesa-stable@lists.freedesktop.org>
Reported-by: Rhys Kidd <rhyskidd@gmail.com>
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Rhys Kidd <rhyskidd@gmail.com>
7 years ago radv/ac: apply slice rounding to 1d arrays as well.
Dave Airlie [Tue, 31 Jan 2017 00:09:11 +0000 (10:09 +1000)]
radv/ac: apply slice rounding to 1d arrays as well.

Fixes:
dEQP-VK.glsl.texture_functions.texture.*1darray*

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
7 years ago radv/geom: check if esgs and gsvs ring exists before filling geom rings
Dave Airlie [Tue, 31 Jan 2017 00:37:25 +0000 (10:37 +1000)]
radv/geom: check if esgs and gsvs ring exists before filling geom rings

There are some corner cases where you end up with an esgs ring, but no
gsvs ring, test for both before dereferencing.

Fixes:
dEQP-VK.geometry.emit.points_emit_0_end_0

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
7 years ago radv: enable geometryShader and multiViewport capabilities.
Dave Airlie [Fri, 20 Jan 2017 02:42:26 +0000 (12:42 +1000)]
radv: enable geometryShader and multiViewport capabilities.

This enables geometry shader support on radv.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
7 years ago radv: handle layer export from vs->fs properly
Dave Airlie [Mon, 30 Jan 2017 19:56:49 +0000 (05:56 +1000)]
radv: handle layer export from vs->fs properly

Fixes:
dEQP-VK.geometry.layered.1d_array.fragment_layer

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
7 years ago radv: emit esgs itemsize register.
Dave Airlie [Fri, 20 Jan 2017 02:41:19 +0000 (12:41 +1000)]
radv: emit esgs itemsize register.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
7 years ago radv: handle prim id inputs to fragment shader.
Dave Airlie [Fri, 20 Jan 2017 02:40:13 +0000 (12:40 +1000)]
radv: handle prim id inputs to fragment shader.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
7 years ago radv: emit geometry shaders to hardware
Dave Airlie [Fri, 20 Jan 2017 02:33:45 +0000 (12:33 +1000)]
radv: emit geometry shaders to hardware

This emits the compiled geometry shader and other state registers.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>