mesa.git
8 years agovirtio_gpu: Add PCI ID to driver map
Rob Herring [Thu, 17 Dec 2015 15:45:49 +0000 (09:45 -0600)]
virtio_gpu: Add PCI ID to driver map

Add the virtio-gpu PCI ID so the driver probing works.

Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
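
To illustrate what a driver-map entry does, here is a minimal sketch in C of a PCI ID to driver name lookup. The struct, table and function names are hypothetical and are not Mesa's actual loader code; the IDs used (vendor 0x1af4, device 0x1050) are the usual virtio-gpu PCI IDs and are an assumption, since the commit message does not list them.

```c
#include <stddef.h>

/* Hypothetical driver-map entry: one PCI vendor, a list of device IDs,
 * and the name of the driver to load for them. */
struct driver_map_entry {
    int vendor_id;
    const char *driver;
    const int *chip_ids;
    size_t num_chip_ids;
};

/* Assumed virtio-gpu device ID (not taken from the commit message). */
static const int virtio_gpu_chip_ids[] = { 0x1050 };

static const struct driver_map_entry driver_map[] = {
    { 0x1af4, "virtio_gpu", virtio_gpu_chip_ids, 1 },
};

/* Return the driver name for a PCI vendor/device pair, or NULL if the
 * pair is not in the map (probing then fails for that device). */
const char *lookup_driver(int vendor_id, int chip_id)
{
    for (size_t i = 0; i < sizeof(driver_map) / sizeof(driver_map[0]); i++) {
        if (driver_map[i].vendor_id != vendor_id)
            continue;
        for (size_t j = 0; j < driver_map[i].num_chip_ids; j++) {
            if (driver_map[i].chip_ids[j] == chip_id)
                return driver_map[i].driver;
        }
    }
    return NULL;
}
```

Without an entry for the device, a lookup of this kind returns nothing and probing cannot pick a driver, which is the situation the commit addresses.
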
8 years agoi965: Implement a drirc workaround for broken dual color blending.
Kenneth Graunke [Thu, 21 Jan 2016 01:33:14 +0000 (17:33 -0800)]
i965: Implement a drirc workaround for broken dual color blending.

OpenGL's dual color blending feature was specified so that an
implementation could support both multiple render targets (MRT) and
dual source blending.  Fragment shader outputs specify both "location"
(the render target number) and "index" (either color 0 or 1).

I believe DirectX only has the notion of "location" - if using dual
color blending, location 0 or 1 will specify the operands.  If not,
then location means the render target index.  The two features can't
be used together.

As such, some applications mistakenly try to use <loc = 0, index = 0>
and <loc = 1, index = 0> in a shader used for dual color blending with
a single render target, rather than the correct <loc = 0, index = 0>
and <loc = 0, index = 1>.

In particular, Unigine Heaven 4.0 and Valley 1.0 suffer from this bug.
Unigine is aware of the problem, and quickly developed a fix, but has
not bothered to change the download link on their website to a working
copy in over a year.  People were still using the broken version and
complaining.  We tried working around this by disabling dual color
blending, but that apparently hurts performance, and people were once
again unhappy.

On i965, dual source blending is achieved by using different framebuffer
write messages than normal rendering.  So, we have to compile different
code for the two cases.  We're not being pedantic: we actually have to
know in order to function.

Normally, dual source blending is detectable in the shader: if a shader
has an output with index = 1, then it's meant for blending, not MRT.
With the broken inputs, they're indistinguishable, so we can only tell
by looking at the current GL state.

This patch implements a new drirc workaround:

   export dual_color_blend_by_location=true

which makes the i965 driver detect when OpenGL state is configured for
dual source blending, and recompile the fragment shader to use the right
messages.  In that case, we allow either location = 1 or index = 1 to
specify the second source for the blending equations.

It also re-enables GL_ARB_blend_func_extended for Unigine.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92233
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
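
The detection rule described in this commit can be sketched roughly as follows. This is a simplified illustration and not the i965 code: struct fs_output, wants_dual_src_blend and both boolean parameters are hypothetical names standing in for the shader outputs, the current GL blend state and the drirc option.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one fragment shader output. */
struct fs_output {
    int location; /* render target number */
    int index;    /* 0 = first blend source, 1 = second blend source */
};

/* Decide whether the shader should be compiled for dual source blending.
 * state_dual_src: the current GL blend state uses a second-source factor.
 * by_location:    the drirc workaround is enabled. */
bool wants_dual_src_blend(const struct fs_output *outs, size_t n,
                          bool state_dual_src, bool by_location)
{
    /* A correct shader marks the second source with index = 1. */
    for (size_t i = 0; i < n; i++) {
        if (outs[i].index == 1)
            return true;
    }

    /* With the workaround, location = 1 is also accepted, but only when
     * the GL state says dual source blending is actually in use. */
    if (by_location && state_dual_src) {
        for (size_t i = 0; i < n; i++) {
            if (outs[i].location == 1)
                return true;
        }
    }
    return false;
}
```

The point of the sketch is that the broken shader (<loc = 0, index = 0> and <loc = 1, index = 0>) looks like ordinary MRT output on its own, so the decision has to take the blend state into account.
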
8 years agoradeonsi: add ETC1 support for Stoney
Marek Olšák [Fri, 22 Jan 2016 15:13:44 +0000 (16:13 +0100)]
radeonsi: add ETC1 support for Stoney

It's a subset of ETC2. Tested.

For more information, see page 42 and onward:
http://www.graphicshardware.org/previous/www_2007/presentations/strom-etc2-gh07.pdf

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
8 years agoradeonsi: change LLVM intrinsics for BREV, CLAMP, EX2
Marek Olšák [Thu, 21 Jan 2016 10:45:07 +0000 (11:45 +0100)]
radeonsi: change LLVM intrinsics for BREV, CLAMP, EX2

Requested by Matt Arsenault.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: add max waves / SIMD to shader stats (v2)
Marek Olšák [Wed, 20 Jan 2016 00:32:05 +0000 (01:32 +0100)]
radeonsi: add max waves / SIMD to shader stats (v2)

v2: account for LDS usage in PS
    the limit is per SIMD, not per CU

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: enable late VS allocation (v3)
Marek Olšák [Tue, 19 Jan 2016 23:01:31 +0000 (00:01 +0100)]
radeonsi: enable late VS allocation (v3)

v2: take the number of CUs into account
v3: change in LS allocation

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: allow using all CUs for tessellation and on-chip GS (v2)
Marek Olšák [Tue, 19 Jan 2016 22:29:32 +0000 (23:29 +0100)]
radeonsi: allow using all CUs for tessellation and on-chip GS (v2)

v2: After more discussion with hw teams, the kernel already contains the
    optimal settings allowing us to use all CUs.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoRevert "mesa: Deal with size differences between GLuint and GLhandleARB in GetAttache...
Jeremy Huddleston Sequoia [Fri, 22 Jan 2016 21:02:01 +0000 (13:02 -0800)]
Revert "mesa: Deal with size differences between GLuint and GLhandleARB in GetAttachedObjectsARB"

This reverts commit 739ac3d39dacdede853d150b9903001524453330.

This will be done a different way.
See http://lists.freedesktop.org/archives/mesa-dev/2016-January/105642.html

8 years agoi965/fs: Remove unused count from vs urb setup
Ben Widawsky [Thu, 21 Jan 2016 19:05:55 +0000 (11:05 -0800)]
i965/fs: Remove unused count from vs urb setup

This was originally removed here:
commit 031d3501322aee0a1474c7f2a9b79f9fa9947430
Author: Kenneth Graunke <kenneth@whitecape.org>
Date:   Tue Aug 25 16:59:12 2015 -0700

    i965/vs: Unify URB entry size/read length calculations between backends.

Then added back:
commit bd198b9f0a292a9ff4ffffec3a29bad23d62caba
Author: Kenneth Graunke <kenneth@whitecape.org>
Date:   Fri Aug 14 16:01:33 2015 -0700

    i965/vs: Simplify fs_visitor's ATTR file.

Note that the authorship dates are out of order, but the above reflects the
order of the commit dates.

Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
8 years agoRevert "radeonsi: fix discard-only fragment shaders (v2)"
Nicolai Hähnle [Fri, 22 Jan 2016 17:37:03 +0000 (12:37 -0500)]
Revert "radeonsi: fix discard-only fragment shaders (v2)"

This reverts commit 843855bbf0da2204ce536623ba957bfa83fdbd52.

It became redundant due to Marek's earlier pushed 8667a1ae which achieves
the same thing.

8 years agoradeonsi: fix discard-only fragment shaders (v2)
Nicolai Hähnle [Tue, 19 Jan 2016 19:59:22 +0000 (14:59 -0500)]
radeonsi: fix discard-only fragment shaders (v2)

When a fragment shader is used that has no outputs but does conditional
discard (KILL_IF), all fragments are killed without this patch.

By comparing various register settings, my conclusion is that the exec mask
is either not properly forwarded to the DB by NULL exports or ends up being
unused, at least when there is _only_ a NULL export (the ISA documentation
claims that NULL exports can be used to override a previously exported exec
mask).

Of the various approaches I have tried to work around the problem, this one
seems to be the least invasive one.

v2: take discard by alpha test into account as well

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93761
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
8 years agomesa: Update _mesa_has_geometry_shaders
Marta Lofstedt [Thu, 21 Jan 2016 15:17:32 +0000 (16:17 +0100)]
mesa: Update _mesa_has_geometry_shaders

Updates the _mesa_has_geometry_shaders function to also look
for OpenGL ES 3.1 contexts that have OES_geometry_shader enabled.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
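
A check of this kind might look like the sketch below. The context struct and enum are hypothetical, heavily reduced stand-ins for Mesa's gl_context, and the desktop branch assumes the usual rule that geometry shaders are core in OpenGL 3.2 and later.

```c
#include <stdbool.h>

enum api { API_OPENGL_COMPAT, API_OPENGL_CORE, API_OPENGLES2 };

/* Hypothetical, pared-down context; Mesa's gl_context is far larger. */
struct context {
    enum api api;
    int version;              /* e.g. 32 for GL 3.2, 31 for ES 3.1 */
    bool oes_geometry_shader; /* extension enabled on this context */
};

bool has_geometry_shaders(const struct context *ctx)
{
    /* ES path: needs ES 3.1 or later plus the OES extension. */
    if (ctx->api == API_OPENGLES2)
        return ctx->version >= 31 && ctx->oes_geometry_shader;

    /* Desktop path (assumption): core feature since OpenGL 3.2. */
    return ctx->version >= 32;
}
```
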
8 years agoglsl: add support for GL_OES_geometry_shader
Marta Lofstedt [Thu, 21 Jan 2016 15:17:31 +0000 (16:17 +0100)]
glsl: add support for GL_OES_geometry_shader

This adds GLSL support for GL_OES_geometry_shader on
OpenGL ES 3.1.

Signed-off-by: Marta Lofstedt <marta.lofstedt@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years agomesa: enable enums for OES_geometry_shader
Marta Lofstedt [Thu, 21 Jan 2016 15:17:30 +0000 (16:17 +0100)]
mesa: enable enums for OES_geometry_shader

Enable GL_OES_geometry_shader enums for OpenGL ES 3.1.

V4: EXTRA tokens updated according to comments from Ilia Mirkin.

Signed-off-by: Marta Lofstedt <marta.lofstedt@linux.intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years agoglapi: add GL_OES_geometry_shader extension
Marta Lofstedt [Thu, 21 Jan 2016 15:17:29 +0000 (16:17 +0100)]
glapi: add GL_OES_geometry_shader extension

Add xml definitions for the GL_OES_geometry_shader extension
and expose the extension for OpenGL ES 3.1.

Signed-off-by: Marta Lofstedt <marta.lofstedt@linux.intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
8 years agodocs: correct 11.1.1 release year
Emil Velikov [Fri, 22 Jan 2016 15:50:48 +0000 (15:50 +0000)]
docs: correct 11.1.1 release year

Seems like I wasn't ready to let 2015 go :-)

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
8 years agodocs: add news item and link release notes for 11.0.9
Emil Velikov [Fri, 22 Jan 2016 15:49:47 +0000 (15:49 +0000)]
docs: add news item and link release notes for 11.0.9

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
8 years agodocs: add sha256 checksums for 11.0.9
Emil Velikov [Fri, 22 Jan 2016 15:40:17 +0000 (15:40 +0000)]
docs: add sha256 checksums for 11.0.9

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
8 years agodocs: add release notes for 11.0.9
Emil Velikov [Fri, 22 Jan 2016 14:51:19 +0000 (14:51 +0000)]
docs: add release notes for 11.0.9

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
8 years agoradeonsi: add ETC2 support for Stoney
Marek Olšák [Mon, 3 Aug 2015 19:47:38 +0000 (21:47 +0200)]
radeonsi: add ETC2 support for Stoney

Tested and working.

8 years agoradeonsi: implement SAMPLEPOS system value without a constant buffer load
Marek Olšák [Wed, 20 Jan 2016 00:45:21 +0000 (01:45 +0100)]
radeonsi: implement SAMPLEPOS system value without a constant buffer load

We always get per-sample input position.

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agowinsys/amdgpu: compute num_good_compute_units correctly
Marek Olšák [Tue, 19 Jan 2016 16:43:11 +0000 (17:43 +0100)]
winsys/amdgpu: compute num_good_compute_units correctly

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agogallium/radeon: rename max_compute_units -> num_good_compute_units
Marek Olšák [Tue, 19 Jan 2016 16:24:57 +0000 (17:24 +0100)]
gallium/radeon: rename max_compute_units -> num_good_compute_units

radeon sets this correctly, but not amdgpu

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: disable SPI color outputs the shader doesn't write
Marek Olšák [Fri, 15 Jan 2016 20:58:53 +0000 (21:58 +0100)]
radeonsi: disable SPI color outputs the shader doesn't write

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: use all SPI color formats
Marek Olšák [Fri, 15 Jan 2016 13:40:19 +0000 (14:40 +0100)]
radeonsi: use all SPI color formats

because not using SPI_SHADER_32_ABGR doubles fill rate.

We should also get optimal performance if alpha isn't needed or blending
isn't enabled.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: use 32_AR for alpha-to-coverage without a color buffer
Marek Olšák [Sat, 16 Jan 2016 03:09:45 +0000 (04:09 +0100)]
radeonsi: use 32_AR for alpha-to-coverage without a color buffer

This avoids the fp16 packing instructions.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: add shader conversion code for all SPI color formats
Marek Olšák [Fri, 15 Jan 2016 13:36:53 +0000 (14:36 +0100)]
radeonsi: add shader conversion code for all SPI color formats

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: set CB_SHADER_MASK according to SPI color formats
Marek Olšák [Mon, 11 Jan 2016 23:52:12 +0000 (00:52 +0100)]
radeonsi: set CB_SHADER_MASK according to SPI color formats

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: use SPI_SHADER_COL_FORMAT fields instead of export_16bpc
Marek Olšák [Mon, 11 Jan 2016 22:51:39 +0000 (23:51 +0100)]
radeonsi: use SPI_SHADER_COL_FORMAT fields instead of export_16bpc

This does change the behavior slightly:
  If a shader writes COLOR[i] and that color buffer isn't bound,
  the shader will export MRT_NULL instead and discard the IR tree that
  calculates the output. The only exception is alpha-to-coverage, which
  requires an alpha export.

v2: - update a comment about 16BPC
    - account for MRTZ when fixing alpha-test/kill

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: don't enable blending if colormask == 0
Marek Olšák [Fri, 15 Jan 2016 11:59:48 +0000 (12:59 +0100)]
radeonsi: don't enable blending if colormask == 0

most likely useless, but doesn't hurt

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoglsl: always compute proper varying type, irrespective of varying packing
Ilia Mirkin [Thu, 21 Jan 2016 12:17:06 +0000 (07:17 -0500)]
glsl: always compute proper varying type, irrespective of varying packing

Normally there's a producer and consumer, and the producer var gets
picked. In both the vertex->gs and tes->gs cases, that's the un-arrayed
version.

In the SSO case, however, there is no producer. So we picked the arrayed
GS variable, and as a result, used more slots than we should. More
critically, these slots would also no longer line up with the producer's
calculation. To fix this, we need to fix up the type of the variable
based on stage no matter what.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93650
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
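
The slot-counting problem can be illustrated with a small sketch. The struct and function below are hypothetical and handle only a single array level; they are meant to show why a per-vertex GS input such as "in vec4 color[3]" must be counted like the producer's un-arrayed "out vec4 color", not to reproduce the linker code.

```c
#include <stdbool.h>

/* Hypothetical, simplified varying description. */
struct varying {
    unsigned slots_per_element; /* e.g. 1 for a vec4, 4 for a mat4 */
    unsigned array_length;      /* 0 if the variable is not an array */
    bool per_vertex_input;      /* input arrayed over the vertices */
};

/* Number of location slots the varying occupies in the interface. */
unsigned varying_slots(const struct varying *v)
{
    unsigned len = v->array_length;

    /* The outer array of a per-vertex input only indexes the vertices,
     * so it is stripped before counting, whatever the packing mode. */
    if (v->per_vertex_input)
        len = 0;

    return v->slots_per_element * (len ? len : 1);
}
```

Counting the arrayed form would give 3 slots for the example above, while the producer only assigns 1, which is the mismatch described for the SSO case.
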
8 years agoegl/dri2: expose srgb configs when KHR_gl_colorspace is available
Emil Velikov [Sun, 29 Nov 2015 16:48:51 +0000 (16:48 +0000)]
egl/dri2: expose srgb configs when KHR_gl_colorspace is available

Otherwise the user has no way of using it, and we'll try to access the
linear one.

v2:
 - Bail out when KHR_gl_colorspace is missing and srgb is set (Marek)

Cc: Chih-Wei Huang <cwhuang@android-x86.org>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Fixes: c2c2e9ab604(egl: implement EGL_KHR_gl_colorspace (v2))
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91596
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Mauro Rossi <issor.oruam@gmail.com>
8 years agotargets/dri: android: use WHOLE static libraries
Emil Velikov [Sun, 29 Nov 2015 16:38:54 +0000 (16:38 +0000)]
targets/dri: android: use WHOLE static libraries

By using whole static libraries, the Android build system provides a
whole-archive-like solution. This means that we don't need to worry
about the order of the static libraries and any reverse, recursive or
circular dependencies that they have between one another.

Without this the linker will discard any unused hunks of one library
and we'll end up with unresolved symbols as those are required by
another static library. This issue has become more prominent with the
introduction of pipe-loader.

Whole static libraries have been used in i915/i965 for a very long
time, so we might as well do the same.

v2:
 - Better commit message (Ilia)
 - Keep external dependencies as [normal] static libs (Mauro)

Cc: mesa-stable@lists.freedesktop.org
Cc: Mauro Rossi <issor.oruam@gmail.com>
Reported-by: Mauro Rossi <issor.oruam@gmail.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
8 years agoi915: correctly parse/set the context flags
Emil Velikov [Fri, 18 Dec 2015 15:28:03 +0000 (15:28 +0000)]
i915: correctly parse/set the context flags

With an earlier commit we've split the flags parsing to a separate
function, but forgot to update all the dri modules to use it.

Noticed when we've enabled KHR_debug for every dri module - fdo#93048

Fixes: 38366c0c6e7 "dri_util: Don't assume __DRIcontext->driverPrivate
is a gl_context"
Cc: Mark Janes <mark.a.janes@intel.com>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Cc: Kristian Høgsberg <krh@bitplanet.net>
Cc: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
8 years agoglsl/lower_instructions: fix regression in dldexp_to_arith
Iago Toral Quiroga [Thu, 21 Jan 2016 09:46:39 +0000 (10:46 +0100)]
glsl/lower_instructions: fix regression in dldexp_to_arith

The commit b4e198f47f842 changed the offset and bits parameters of the
bitfield insert operation from scalars to vectors. However, the lowering
of ldexp on doubles operates on each vector component and emits scalar
code (since it has to deal with the lower and upper 32-bit chunks of
each double component), so it needs its bits and offset parameters to
be scalars.

Fixes fp64 regression (crash) in:
spec/arb_gpu_shader_fp64/execution/built-in-functions/fs-ldexp-dvec4.shader_test

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
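
To show why the lowering works on scalar 32-bit chunks, here is a sketch in C of ldexp on a single double done by editing the exponent field in the upper 32-bit word. It is an illustration of the general technique, not the GLSL IR that lower_instructions emits, and it assumes IEEE 754 doubles, a normal input and a result that stays in the normal range.

```c
#include <stdint.h>
#include <string.h>

/* Scale a double by 2^exp by rewriting its exponent bits.
 * Sketch only: no handling of zero, denormals, infinity or NaN. */
double ldexp_sketch(double x, int exp)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);

    uint32_t lo = (uint32_t)bits;         /* lower 32-bit chunk, untouched */
    uint32_t hi = (uint32_t)(bits >> 32); /* sign, exponent, top mantissa */

    /* The 11-bit biased exponent lives in bits 20..30 of the upper word. */
    uint32_t old_exp = (hi >> 20) & 0x7ffu;
    uint32_t new_exp = (uint32_t)((int)old_exp + exp) & 0x7ffu;

    /* Bitfield insert with scalar parameters: offset = 20, bits = 11. */
    hi = (hi & ~(0x7ffu << 20)) | (new_exp << 20);

    bits = ((uint64_t)hi << 32) | lo;
    memcpy(&x, &bits, sizeof x);
    return x;
}
```

For a dvec4 the same scalar sequence would be repeated per component, which is why the offset and bits operands of the insert are scalars here.
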
8 years agoi965/vec4/tcs: Return NULL instead of false in brw_compile_tcs()
Eduardo Lima Mitev [Thu, 21 Jan 2016 16:45:18 +0000 (17:45 +0100)]
i965/vec4/tcs: Return NULL instead of false in brw_compile_tcs()

brw_compile_tcs() is expected to return 'const unsigned *', so the compiler
complains.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
8 years agofreedreno/a4xx: Add support for adreno 430
cstout [Sat, 12 Dec 2015 00:58:45 +0000 (16:58 -0800)]
freedreno/a4xx: Add support for adreno 430

Signed-off-by: Rob Clark <robclark@freedesktop.org>
8 years agofreedreno: make opc array static const
Christian Gmeiner [Wed, 20 Jan 2016 21:11:52 +0000 (22:11 +0100)]
freedreno: make opc array static const

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
8 years agofreedreno: implement emit_string_marker
Rob Clark [Mon, 10 Aug 2015 16:11:13 +0000 (12:11 -0400)]
freedreno: implement emit_string_marker

Writes the string to the cmdstream in the payload of a no-op packet.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
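
The idea of carrying a string in a no-op packet can be sketched as below. The packet header layout and HYPOTHETICAL_NOP_OPCODE are made up for illustration and do not describe the real adreno packet format; only the general approach (header dword followed by the string padded to whole dwords) is taken from the commit message.

```c
#include <stdint.h>
#include <string.h>

/* Made-up header layout: opcode in the top byte, payload length in
 * dwords in the low 16 bits. */
#define HYPOTHETICAL_NOP_OPCODE 0x10u

/* Write a no-op packet whose payload is `len` bytes of `str`, padded
 * with zeros to a dword boundary. Returns the number of dwords written,
 * or 0 if the string does not fit. */
size_t emit_string_marker(uint32_t *cmds, size_t max_dwords,
                          const char *str, size_t len)
{
    size_t payload = (len + 3) / 4; /* round up to whole dwords */

    if (payload > 0xffff || payload + 1 > max_dwords)
        return 0;

    cmds[0] = (HYPOTHETICAL_NOP_OPCODE << 24) | (uint32_t)payload;
    memset(&cmds[1], 0, payload * 4); /* zero the padding bytes */
    memcpy(&cmds[1], str, len);
    return payload + 1;
}
```

Because the hardware skips the payload of a no-op, the string has no effect on rendering but shows up when the command stream is dumped.
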
8 years agogallium: add GREMEDY_string_marker
Rob Clark [Mon, 10 Aug 2015 15:41:29 +0000 (11:41 -0400)]
gallium: add GREMEDY_string_marker

Since the GREMEDY extensions are normally only exposed by the gremedy
debugger (and could possibly trigger debug paths in the app), we don't
expose the extension by default, but instead only with
ST_DEBUG=gremedy.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years agomesa: wire up EmitStringMarker for KHR_debug
Rob Clark [Sat, 5 Dec 2015 16:32:25 +0000 (11:32 -0500)]
mesa: wire up EmitStringMarker for KHR_debug

The extension spec[1] describes DEBUG_TYPE_MARKER as "Annotation of the
command stream".  So for DEBUG_TYPE_MARKER, also pass the buf to the
driver's EmitStringMarker() to be inserted in the command stream.

[1] https://www.opengl.org/registry/specs/KHR/debug.txt

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
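
The forwarding described here amounts to a type check in front of an optional driver hook. The sketch below uses a hypothetical driver_funcs struct and function names; only the GL_DEBUG_TYPE_* enum values come from the KHR_debug specification.

```c
#include <stddef.h>

#define GL_DEBUG_TYPE_OTHER  0x8251
#define GL_DEBUG_TYPE_MARKER 0x8268

/* Hypothetical driver hook table; the hook may be NULL. */
struct driver_funcs {
    void (*EmitStringMarker)(void *drv, const char *buf, size_t len);
};

/* Example hook for demonstration: counts markers in an int that `drv`
 * points to. */
void count_marker(void *drv, const char *buf, size_t len)
{
    (void)buf;
    (void)len;
    ++*(int *)drv;
}

/* Forward marker-type debug messages to the driver. Returns 1 if the
 * string was handed to the driver, 0 otherwise. */
int forward_debug_message(const struct driver_funcs *funcs, void *drv,
                          unsigned type, const char *buf, size_t len)
{
    if (type == GL_DEBUG_TYPE_MARKER && funcs->EmitStringMarker) {
        funcs->EmitStringMarker(drv, buf, len);
        return 1;
    }
    return 0;
}
```
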
8 years agomesa: add GREMEDY_string_marker
Rob Clark [Mon, 10 Aug 2015 14:37:53 +0000 (10:37 -0400)]
mesa: add GREMEDY_string_marker

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
8 years agotexobj: Remove redundant checks that the texture cube faces match size
Neil Roberts [Thu, 21 Jan 2016 17:28:07 +0000 (17:28 +0000)]
texobj: Remove redundant checks that the texture cube faces match size

The texture mipmap completeness checking code was checking whether all
of the faces have the same size. However, this is pointless because the
code just above it already checks that each face has the expected size
calculated for the mipmap level, so the error condition could never be
reached. This patch just removes it.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
8 years agotexobj: Fix the completeness checks for cube textures
Neil Roberts [Thu, 21 Jan 2016 17:12:29 +0000 (17:12 +0000)]
texobj: Fix the completeness checks for cube textures

According to the GL 1.4 spec section 3.8.10, a cubemap texture is only
complete if:

• The level base arrays of each of the six texture images making up
  the cube map have identical, positive, and square dimensions.
• The level base arrays were each specified with the same internal
  format.
• The level base arrays each have the same border width.

Previously the texture completeness code was only checking the first
point. This patch makes it additionally check the other two.

This fixes the following two dEQP tests:

deqp-gles2.functional.texture.completeness.cube.format_mismatch_rgba_rgb_level_0_neg_z
deqp-gles2.functional.texture.completeness.cube.format_mismatch_rgb_rgba_level_0_pos_z

And also this Piglit test:

spec/!opengl 2.0/incomplete-cubemap-format

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93792
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
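
The three conditions quoted from the spec can be sketched as a single check over the six base-level face images. The struct and function names are hypothetical and much simpler than Mesa's texture image structures.

```c
#include <stdbool.h>

/* Hypothetical, reduced description of one cube face at the base level. */
struct face_image {
    int width, height;
    int border;
    unsigned internal_format;
};

bool cube_base_level_complete(const struct face_image faces[6])
{
    const struct face_image *ref = &faces[0];

    /* Dimensions must be positive and square. */
    if (ref->width <= 0 || ref->width != ref->height)
        return false;

    for (int i = 1; i < 6; i++) {
        /* Identical dimensions on every face. */
        if (faces[i].width != ref->width || faces[i].height != ref->height)
            return false;
        /* Same internal format on every face. */
        if (faces[i].internal_format != ref->internal_format)
            return false;
        /* Same border width on every face. */
        if (faces[i].border != ref->border)
            return false;
    }
    return true;
}
```

A cube map with five RGBA faces and one RGB face passes the first test but fails the second, which is the case the listed dEQP tests exercise.
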
8 years agor600g: don't leak driver const buffers
Grazvydas Ignotas [Wed, 20 Jan 2016 23:52:24 +0000 (01:52 +0200)]
r600g: don't leak driver const buffers

The buffers are referenced from r600_update_driver_const_buffers()
 -> r600_set_constant_buffer() -> u_upload_data(), but nothing
ever releases the reference. Similar case with driver_consts.
Found using valgrind.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
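
The leak pattern is the usual one for reference-counted buffers: a slot takes a reference and nothing ever drops it. The sketch below uses a hypothetical buffer type and helper, loosely modelled on the "set pointer and adjust counts" style of helper, and is not the r600g code.

```c
#include <stdlib.h>

/* Hypothetical reference-counted buffer. */
struct buffer {
    int refcount;
};

/* Create a buffer holding one reference, or return NULL on failure. */
struct buffer *buffer_create(void)
{
    struct buffer *b = malloc(sizeof *b);
    if (b)
        b->refcount = 1;
    return b;
}

/* Make *dst point at src and adjust both reference counts. The old
 * buffer is freed when its count drops to zero. Passing src = NULL
 * releases the reference held by *dst. */
void buffer_reference(struct buffer **dst, struct buffer *src)
{
    struct buffer *old = *dst;

    if (src)
        src->refcount++;
    if (old && --old->refcount == 0)
        free(old);
    *dst = src;
}
```

In these terms, the fix corresponds to calling buffer_reference(&slot, NULL) for each driver constant buffer slot when the context is torn down.
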
8 years agomesa: Deal with size differences between GLuint and GLhandleARB in GetAttachedObjectsARB
Jeremy Huddleston Sequoia [Thu, 21 Jan 2016 01:10:54 +0000 (17:10 -0800)]
mesa: Deal with size differences between GLuint and GLhandleARB in GetAttachedObjectsARB

Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Reviewed-by: Nicolai Hähnle <nhaehnle@gmail.com>
8 years agomesa: Fix format warnings
Jeremy Huddleston Sequoia [Thu, 21 Jan 2016 01:03:26 +0000 (17:03 -0800)]
mesa: Fix format warnings

main/shaderapi.c:1318:51: warning: format specifies type 'unsigned int' but the argument has type 'GLhandleARB' (aka 'unsigned long') [-Wformat]
      _mesa_debug(ctx, "glDeleteObjectARB(%u)\n", obj);
                                          ~~      ^~~
                                          %lu

Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
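
One common way to silence this kind of warning is to match the format specifier to the widest type the handle can have and cast the argument. The sketch below uses a stand-in typedef and a hypothetical helper; the commit message does not say which fix was actually applied.

```c
#include <stdio.h>

/* Stand-in for GLhandleARB on a platform where it is an unsigned long. */
typedef unsigned long handle_arb;

/* Format the debug message into `out`. Returns what snprintf returns. */
int format_delete_msg(char *out, size_t size, handle_arb obj)
{
    /* %lu matches unsigned long; the cast keeps the call correct even
     * where the handle type is a narrower unsigned integer. */
    return snprintf(out, size, "glDeleteObjectARB(%lu)\n",
                    (unsigned long)obj);
}
```
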
8 years agomesa: Fix some function prototype mismatching
Jeremy Huddleston Sequoia [Thu, 21 Jan 2016 00:59:45 +0000 (16:59 -0800)]
mesa: Fix some function prototype mismatching

main/api_exec.c:543:36: warning: incompatible pointer types passing 'void (GLhandleARB, GLuint, const GLcharARB *)' (aka 'void (unsigned long, unsigned int, const char *)') to
parameter of
      type 'void (*)(GLuint, GLuint, const GLchar *)' (aka 'void (*)(unsigned int, unsigned int, const char *)') [-Wincompatible-pointer-types]
      SET_BindAttribLocation(exec, _mesa_BindAttribLocation);
                                   ^~~~~~~~~~~~~~~~~~~~~~~~
./main/dispatch.h:7590:88: note: passing argument to parameter 'fn' here
static inline void SET_BindAttribLocation(struct _glapi_table *disp, void (GLAPIENTRYP fn)(GLuint, GLuint, const GLchar *)) {
                                                                                       ^
main/api_exec.c:547:31: warning: incompatible pointer types passing 'void (GLhandleARB)' (aka 'void (unsigned long)') to parameter of type 'void (*)(GLuint)' (aka 'void (*)(unsigned
int)')
      [-Wincompatible-pointer-types]
      SET_CompileShader(exec, _mesa_CompileShader);
                              ^~~~~~~~~~~~~~~~~~~
./main/dispatch.h:7612:83: note: passing argument to parameter 'fn' here
static inline void SET_CompileShader(struct _glapi_table *disp, void (GLAPIENTRYP fn)(GLuint)) {
                                                                                  ^
main/api_exec.c:568:33: warning: incompatible pointer types passing 'void (GLhandleARB, GLuint, GLsizei, GLsizei *, GLint *, GLenum *, GLcharARB *)' (aka 'void (unsigned long,
unsigned int,
      int, int *, int *, unsigned int *, char *)') to parameter of type 'void (*)(GLuint, GLuint, GLsizei, GLsizei *, GLint *, GLenum *, GLchar *)' (aka 'void (*)(unsigned int,
unsigned int,
      int, int *, int *, unsigned int *, char *)') [-Wincompatible-pointer-types]
      SET_GetActiveAttrib(exec, _mesa_GetActiveAttrib);
                                ^~~~~~~~~~~~~~~~~~~~~
./main/dispatch.h:7711:85: note: passing argument to parameter 'fn' here
static inline void SET_GetActiveAttrib(struct _glapi_table *disp, void (GLAPIENTRYP fn)(GLuint, GLuint, GLsizei , GLsizei *, GLint *, GLenum *, GLchar *)) {
                                                                                    ^
main/api_exec.c:571:35: warning: incompatible pointer types passing 'GLint (GLhandleARB, const GLcharARB *)' (aka 'int (unsigned long, const char *)') to parameter of type
      'GLint (*)(GLuint, const GLchar *)' (aka 'int (*)(unsigned int, const char *)') [-Wincompatible-pointer-types]
      SET_GetAttribLocation(exec, _mesa_GetAttribLocation);
                                  ^~~~~~~~~~~~~~~~~~~~~~~
./main/dispatch.h:7744:88: note: passing argument to parameter 'fn' here
static inline void SET_GetAttribLocation(struct _glapi_table *disp, GLint (GLAPIENTRYP fn)(GLuint, const GLchar *)) {
                                                                                       ^
main/api_exec.c:585:33: warning: incompatible pointer types passing 'void (GLhandleARB, GLsizei, GLsizei *, GLcharARB *)' (aka 'void (unsigned long, int, int *, char *)') to
parameter of
      type 'void (*)(GLuint, GLsizei, GLsizei *, GLchar *)' (aka 'void (*)(unsigned int, int, int *, char *)') [-Wincompatible-pointer-types]
      SET_GetShaderSource(exec, _mesa_GetShaderSource);
                                ^~~~~~~~~~~~~~~~~~~~~
./main/dispatch.h:7788:85: note: passing argument to parameter 'fn' here
static inline void SET_GetShaderSource(struct _glapi_table *disp, void (GLAPIENTRYP fn)(GLuint, GLsizei, GLsizei *, GLchar *)) {
                                                                                    ^
main/api_exec.c:597:29: warning: incompatible pointer types passing 'void (GLhandleARB)' (aka 'void (unsigned long)') to parameter of type 'void (*)(GLuint)' (aka 'void (*)(unsigned
int)')
      [-Wincompatible-pointer-types]
      SET_LinkProgram(exec, _mesa_LinkProgram);
                            ^~~~~~~~~~~~~~~~~
./main/dispatch.h:7909:81: note: passing argument to parameter 'fn' here
static inline void SET_LinkProgram(struct _glapi_table *disp, void (GLAPIENTRYP fn)(GLuint)) {
                                                                                ^
main/api_exec.c:628:30: warning: incompatible pointer types passing 'void (GLhandleARB, GLsizei, const GLcharARB *const *, const GLint *)' (aka
      'void (unsigned long, int, const char *const *, const int *)') to parameter of type 'void (*)(GLuint, GLsizei, const GLchar *const *, const GLint *)' (aka 'void (*)(unsigned
int, int,
      const char *const *, const int *)') [-Wincompatible-pointer-types]
      SET_ShaderSource(exec, _mesa_ShaderSource);
                             ^~~~~~~~~~~~~~~~~~
./main/dispatch.h:7920:82: note: passing argument to parameter 'fn' here
static inline void SET_ShaderSource(struct _glapi_table *disp, void (GLAPIENTRYP fn)(GLuint, GLsizei, const GLchar * const *, const GLint *)) {
                                                                                 ^
main/api_exec.c:653:28: warning: incompatible pointer types passing 'void (GLhandleARB)' (aka 'void (unsigned long)') to parameter of type 'void (*)(GLuint)' (aka 'void (*)(unsigned
int)')
      [-Wincompatible-pointer-types]
      SET_UseProgram(exec, _mesa_UseProgram);
                           ^~~~~~~~~~~~~~~~
./main/dispatch.h:8173:80: note: passing argument to parameter 'fn' here
static inline void SET_UseProgram(struct _glapi_table *disp, void (GLAPIENTRYP fn)(GLuint)) {
                                                                               ^
main/api_exec.c:655:33: warning: incompatible pointer types passing 'void (GLhandleARB)' (aka 'void (unsigned long)') to parameter of type 'void (*)(GLuint)' (aka 'void (*)(unsigned
int)')
      [-Wincompatible-pointer-types]
      SET_ValidateProgram(exec, _mesa_ValidateProgram);
                                ^~~~~~~~~~~~~~~~~~~~~
./main/dispatch.h:8184:85: note: passing argument to parameter 'fn' here
static inline void SET_ValidateProgram(struct _glapi_table *disp, void (GLAPIENTRYP fn)(GLuint)) {

main/dlist.c:9457:26: warning: incompatible pointer types passing 'void (GLhandleARB)' (aka 'void (unsigned long)') to parameter of type 'void (*)(GLuint)' (aka 'void (*)(unsigned
int)')
      [-Wincompatible-pointer-types]
   SET_UseProgram(table, save_UseProgramObjectARB);
                         ^~~~~~~~~~~~~~~~~~~~~~~~
./main/dispatch.h:8173:80: note: passing argument to parameter 'fn' here
static inline void SET_UseProgram(struct _glapi_table *disp, void (GLAPIENTRYP fn)(GLuint)) {
                                                                               ^
1 warning generated.

Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoglapi: Build glapi_gentable.c only on Darwin
Andreas Boll [Wed, 9 Dec 2015 12:41:22 +0000 (13:41 +0100)]
glapi: Build glapi_gentable.c only on Darwin

Removes the public symbol _glapi_create_table_from_handle from
libGL.so.1.2.0 on all platforms except Darwin.

Since the symbol is not used on other platforms it makes sense to
build glapi_gentable.c only on Darwin.

As a side effect it accelerates the build a bit and reduces the size
of libGL.so.1.2.0 as follows:

size lib/libGL.so.1.2.0 on my system shows
   text    data     bss     dec     hex filename
 469211   21848    2720  493779   788d3 lib/libGL.so.1.2.0 before
 420988   11240    2720  434948   6a304 lib/libGL.so.1.2.0 after

A little bit of history:

_glapi_create_table_from_handle was introduced in

commit 85937f4c0d4a78d3a11e3c1fa6148640f2a9ad7b
Author: Jeremy Huddleston <jeremyhu@apple.com>
Date:   Thu Jun 9 16:59:49 2011 -0700

    glapi: Add API that can create a _glapi_table from a dlfcn handle

    Example usage:

    void *handle = dlopen(opengl_library_path, RTLD_LOCAL);
    struct _glapi_table *disp = _glapi_create_table_from_handle(handle, "gl");

    Signed-off-by: Jeremy Huddleston <jeremyhu@apple.com>

and the only user in mesa was added in

commit f35913b96e743c5014e99220b1a1c5532a894d69
Author: Jeremy Huddleston <jeremyhu@apple.com>
Date:   Thu Jun 9 17:29:51 2011 -0700

    apple: Use _glapi_create_table_from_handle to initialize our dispatch table

    Signed-off-by: Jeremy Huddleston <jeremyhu@apple.com>

gl_gentable.py was also used for XQuartz in xserver 1.11 - 1.14.

v2: Fix typos in commit message
    Add missing XORG_GLAPI_OUTPUTS += \ into src/mapi/glapi/gen/Makefile.am
    Add glapi_gentable.c to EXTRA_DIST for inclusion in the release
    tarball

v3: Fix commit message: s/gl_gentable.c/glapi_gentable.c/

Reported-by: Arlie Davis <arlied@google.com>
Cc: Jeremy Huddleston <jeremyhu@apple.com>
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
8 years agomesa: Reduce libGL.so binary size by about 15%
Arlie Davis [Thu, 17 Sep 2015 22:19:24 +0000 (15:19 -0700)]
mesa: Reduce libGL.so binary size by about 15%

This patch significantly reduces the size of the libGL.so binary. It does
not change the (externally visible) behavior of libGL.so at all.

gl_gentable.py generates a function, _glapi_create_table_from_handle.
This function allocates a large dispatch table, consisting of 1300 or so
function pointers, and fills this dispatch table by doing symbol lookups
on a given shared library.  Previously, gl_gentable.py would generate a
single, very large _glapi_create_table_from_handle function, with a short
cluster of lines for each entry point (function).  The idiom it generates
was a NULL check, a call to snprintf, a call to dlsym / GetProcAddress,
and then a store into the dispatch table.  Since this function processes
a large number of entry points, this code is duplicated many times over.

We can encode the same information much more compactly, by using a lookup
table.  The previous total size of _glapi_create_table_from_handle on x64
was 125848 bytes.  By using a lookup table, the size of
_glapi_create_table_from_handle (and the related lookup tables) is reduced
to 10840 bytes.  In other words, this enormous function is reduced by 91%.
The size of the entire libGL.so binary (measured when stripped) itself drops
by 15%.

So the purpose of this change is to reduce the binary size, which frees up
disk space, memory, etc.
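
As a rough illustration of the lookup-table idea described above, here is a
minimal sketch. It is not the code that gl_gentable.py generates: the names
entries, fill_table, resolver_fn and stub_resolve are hypothetical, the
three-row table is only a stand-in for the roughly 1300 real entry points, and
the resolver callback takes the place of dlsym / GetProcAddress so the sketch
has no platform dependency.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical resolver type: stands in for dlsym() / GetProcAddress(). */
typedef void (*proc_t)(void);
typedef proc_t (*resolver_fn)(void *handle, const char *symbol);

/* One row per entry point: the unprefixed name and its dispatch slot. */
struct entry {
    const char *name;
    size_t slot;
};

/* Illustrative three-row table (assumed names and slots). */
static const struct entry entries[] = {
    { "NewList", 0 },
    { "EndList", 1 },
    { "CallList", 2 },
};

#define NUM_ENTRIES (sizeof(entries) / sizeof(entries[0]))

/* Fill table[] by looping over the rows instead of repeating the
 * NULL check / snprintf / lookup / store sequence once per entry point.
 * Returns the number of slots that were resolved. */
static size_t
fill_table(proc_t *table, size_t table_len, void *handle,
           const char *prefix, resolver_fn resolve)
{
    char symbol[256];
    size_t resolved = 0;

    assert(table != NULL && prefix != NULL && resolve != NULL);

    for (size_t i = 0; i < NUM_ENTRIES; i++) {
        const struct entry *e = &entries[i];

        if (e->slot >= table_len || table[e->slot] != NULL)
            continue; /* out of range or already populated */

        snprintf(symbol, sizeof(symbol), "%s%s", prefix, e->name);
        table[e->slot] = resolve(handle, symbol);
        if (table[e->slot] != NULL)
            resolved++;
    }
    return resolved;
}

/* Stub used only to exercise the sketch: it "finds" every symbol
 * except glCallList. */
static void dummy_proc(void) { }

static proc_t
stub_resolve(void *handle, const char *symbol)
{
    (void)handle;
    return strcmp(symbol, "glCallList") == 0 ? NULL : dummy_proc;
}
```

The per-entry work now lives in one loop body, so adding an entry point costs
one table row rather than another copy of the same instruction sequence.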

size lib/libGL.so.1.2.0 on my system shows (Andreas)
   text    data     bss     dec     hex filename
 565947   11256    2720  579923   8d953 lib/libGL.so.1.2.0 before
 469211   21848    2720  493779   788d3 lib/libGL.so.1.2.0 after
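
The percentages quoted in this message can be re-derived from the numbers it
gives. The helper below is only a sketch (reduction_percent is a hypothetical
name). Note that the message measures the 15% on a stripped binary, while the
size(1) figures above are a separate measurement that happens to land close
to it.

```c
#include <assert.h>

/* Percentage by which `after` is smaller than `before`. */
static double
reduction_percent(double before, double after)
{
    assert(before > 0.0);
    return 100.0 * (before - after) / before;
}
```

With the function sizes from the text, reduction_percent(125848, 10840) is
about 91.4, and with the "dec" column of the size output,
reduction_percent(579923, 493779) is about 14.9.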

v2: Incorporate Matt's feedback.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Tested-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
8 years agonv50/ir: 64-bit splitting fixes
Ilia Mirkin [Tue, 19 Jan 2016 10:37:24 +0000 (05:37 -0500)]
nv50/ir: 64-bit splitting fixes

Take reading shader outputs into account, and use setFlagsDef for the
carry since we rely on having i->flagsDef being set.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years agogk110/ir: allow carry to be set/read by imad
Ilia Mirkin [Tue, 19 Jan 2016 10:30:56 +0000 (05:30 -0500)]
gk110/ir: allow carry to be set/read by imad

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years agogm107/ir: add carry emission to LOP and IADD
Ilia Mirkin [Tue, 19 Jan 2016 04:55:19 +0000 (23:55 -0500)]
gm107/ir: add carry emission to LOP and IADD

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years agogm107/ir: add ATOM and CCTL support
Ilia Mirkin [Sun, 3 Jan 2016 07:11:51 +0000 (02:11 -0500)]
gm107/ir: add ATOM and CCTL support

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years agogm107/ir: set LD/ST address width bit
Ilia Mirkin [Sat, 7 Nov 2015 08:30:26 +0000 (03:30 -0500)]
gm107/ir: set LD/ST address width bit

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years agogk110/ir: fix double-wide vm address
Ilia Mirkin [Mon, 28 Dec 2015 20:59:03 +0000 (15:59 -0500)]
gk110/ir: fix double-wide vm address

8 years agogk110/ir: add OP_CCTL handling
Ilia Mirkin [Tue, 22 Sep 2015 23:13:14 +0000 (19:13 -0400)]
gk110/ir: add OP_CCTL handling

8 years agogk110/ir: add atomic op emission, fix gmem loads
Ilia Mirkin [Tue, 22 Sep 2015 00:00:36 +0000 (20:00 -0400)]
gk110/ir: add atomic op emission, fix gmem loads

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years agollvmpipe: warn about illegal use of objects in different contexts
Roland Scheidegger [Tue, 19 Jan 2016 23:48:07 +0000 (00:48 +0100)]
llvmpipe: warn about illegal use of objects in different contexts

Doing that is clearly a bug. We can't quite assert, as st/mesa may hit this,
but at least increase its visibility a bit.
(For the non-refcounted objects it would be illegal too, but we can't detect
that unless we'd store the context ourselves. Plus, those don't tend to cause
random crashes at context or object destruction time... So just sampler views,
surfaces and so targets for now.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
8 years agollvmpipe,i915: add back NEW_RASTERIZER dependency when computing vertex info
Roland Scheidegger [Wed, 20 Jan 2016 23:04:56 +0000 (00:04 +0100)]
llvmpipe,i915: add back NEW_RASTERIZER dependency when computing vertex info

I removed this mistakenly in 2dbc20e45689e09766552517a74e2270e49817b5. I
actually thought it should not be necessary and a piglit run didn't show
any differences, but this shouldn't have been in there.
draw_prepare_shader_outputs() is in fact dependent on NEW_RASTERIZER.
The new polygon-mode-facing test indeed shows why this is necessary, there's
lots of invalid reads and writes with valgrind (also crashes without
valgrind), because the pre-pipeline vertex size doesn't match the
post-pipeline vertex size (note this won't help much with stages which don't
have the prepare hook which can grow the vertex size, in particular the wide
point stage, but this isn't used by llvmpipe). The test still won't pass, of
course, but it is only usage of uninitialized values now, which is much
less dangerous...
(Albeit I'm pretty sure for i915 it really is not needed anymore as it
doesn't care about the extra outputs and doesn't call
draw_prepare_shader_outputs().)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
8 years agonv50/ir: don't flip SHL(ADD) into ADD(SHL) if ADD sources have modifiers
Ilia Mirkin [Wed, 20 Jan 2016 22:59:34 +0000 (17:59 -0500)]
nv50/ir: don't flip SHL(ADD) into ADD(SHL) if ADD sources have modifiers

Fixes: 31fde8fa (nv50/ir: flip shl(add, imm) into add(shl, imm))
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years agogk110/ir: fix load from shared memory
Ilia Mirkin [Wed, 20 Jan 2016 22:15:27 +0000 (17:15 -0500)]
gk110/ir: fix load from shared memory

It was accidentally using the store opcode.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years agogk110/ir: add partial BAR support
Ilia Mirkin [Wed, 20 Jan 2016 22:12:59 +0000 (17:12 -0500)]
gk110/ir: add partial BAR support

This is enough for the plain TGSI BARRIER implementation.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years agoRevert "glsl: move uniform calculation to link_uniforms"
Tapani Pälli [Wed, 20 Jan 2016 20:02:22 +0000 (22:02 +0200)]
Revert "glsl: move uniform calculation to link_uniforms"

This reverts commit 4475d8f9169195baefa893b9b147fe20414cda7c.

8 years agoglsl: move uniform calculation to link_uniforms
Tapani Pälli [Fri, 15 Jan 2016 11:11:20 +0000 (13:11 +0200)]
glsl: move uniform calculation to link_uniforms

Patch moves uniform calculation to happen during link_uniforms, this
is possible with help of UniformRemapTable that has all the reserved
locations.

Location assignment for implicit locations is changed so that we
also utilize the 'holes' that explicit uniform location assignment
might have left in UniformRemapTable. This makes it possible to fit
more uniforms, as previously we were lazy here and wasted space.
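
The hole-filling idea can be sketched as a search for a run of free slots.
This is an illustration only, not the patch's code: find_free_run and the
bool array standing in for UniformRemapTable occupancy are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Return the first index of a run of `slots` consecutive free entries
 * in used[0..len), or -1 if no such run exists. A multi-slot run is
 * what an array uniform with an implicit location would need. */
static long
find_free_run(const bool *used, size_t len, size_t slots)
{
    size_t run = 0;

    if (slots == 0)
        return -1;

    for (size_t i = 0; i < len; i++) {
        run = used[i] ? 0 : run + 1;
        if (run == slots)
            return (long)(i + 1 - slots);
    }
    return -1;
}
```

Explicit locations mark their slots as used first. Each implicit uniform then
takes the first hole large enough to hold it, and presumably falls back to
growing the table only when no hole fits.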

Fixes the following CTS tests:
   ES31-CTS.explicit_uniform_location.uniform-loc-mix-with-implicit-max
   ES31-CTS.explicit_uniform_location.uniform-loc-mix-with-implicit-max-array

v2: code cleanups, increment NumUniformRemapTable correctly, fix
    find_empty_block to work properly and add some more comments.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>
8 years agoglsl: add missing explicit_image_format flag to has_layout()
Timothy Arceri [Tue, 19 Jan 2016 23:49:54 +0000 (10:49 +1100)]
glsl: add missing explicit_image_format flag to has_layout()

Fixes piglit regression after fixes to duplicate layout rules.

Previously catching multiple layouts was relying on the code
meant to catch duplicates within a single layout(...), this
change triggers the rules for multiple layouts.

Cc: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
8 years agollvmpipe: turn depth clears into full depth/stencil clears for d24x8 formats
Roland Scheidegger [Mon, 18 Jan 2016 03:29:22 +0000 (04:29 +0100)]
llvmpipe: turn depth clears into full depth/stencil clears for d24x8 formats

If we have a d24x8 format, there is no stencil. Therefore, we can always
clear these bits too, which means this will be some kind of memset rather
than read-modify-write.
This is good for some 7% increase or so in gears with huge window size -
seems to have a bigger effect if things aren't in caches. Of course, any
real app won't spend nearly as much time comparatively in clearing
depth buffer in the first place, so the speedup will be much lower.
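
The difference between the two kinds of clear can be sketched as follows.
This is a simplified model rather than llvmpipe's code: it assumes a 32-bit
pixel with depth in the low 24 bits and unused padding in the top byte, and
the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEPTH_MASK 0x00ffffffu /* assumed layout: depth in the low 24 bits */

/* Depth-only clear that preserves the top byte: every pixel has to be
 * read, masked and written back. */
static void
clear_depth_rmw(uint32_t *buf, size_t count, uint32_t depth24)
{
    assert((depth24 & ~DEPTH_MASK) == 0);
    for (size_t i = 0; i < count; i++)
        buf[i] = (buf[i] & ~DEPTH_MASK) | depth24;
}

/* Full clear: when the top byte is padding its contents do not matter,
 * so each word can be overwritten without being read first. */
static void
clear_full(uint32_t *buf, size_t count, uint32_t depth24)
{
    assert((depth24 & ~DEPTH_MASK) == 0);
    for (size_t i = 0; i < count; i++)
        buf[i] = depth24;
}
```

Both functions leave the same depth bits behind. Only the second avoids
loading the old buffer contents, which fits the observation above that the
gain is larger when the buffer is not in cache.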

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
8 years agoi965: Implement compute sampler state atom.
Francisco Jerez [Sat, 16 Jan 2016 23:11:03 +0000 (15:11 -0800)]
i965: Implement compute sampler state atom.

Fixes a number of GLES31 CTS failures and hangs on various hardware:

 ES31-CTS.texture_gather.plain-gather-depth-2d
 ES31-CTS.texture_gather.plain-gather-depth-2darray
 ES31-CTS.texture_gather.plain-gather-depth-cube
 ES31-CTS.texture_gather.offset-gather-depth-2d
 ES31-CTS.texture_gather.offset-gather-depth-2darray
 ES31-CTS.layout_binding.sampler2D_layout_binding_texture_ComputeShader
 ES31-CTS.layout_binding.sampler2DArray_layout_binding_texture_ComputeShader
 ES31-CTS.explicit_uniform_location.uniform-loc-types-samplers
 ES31-CTS.compute_shader.resources-texture

Some of them were actually passing by luck on some generations even
though we weren't uploading sampler state tables explicitly for the
compute stage, most likely because they relied on the cached sampler
state left from previous rendering to be close enough.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92589
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93312
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93325
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93407
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93725
Reported-by: Marta Lofstedt <marta.lofstedt@intel.com>
Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
8 years agoi965: Trigger CS state reemission when new sampler state is uploaded.
Francisco Jerez [Sat, 16 Jan 2016 23:05:51 +0000 (15:05 -0800)]
i965: Trigger CS state reemission when new sampler state is uploaded.

This reuses the NEW_SAMPLER_STATE_TABLE state bit (currently only used
on pre-Gen7 hardware) to signal that the sampler state tables have
changed in order to make sure that the GPGPU interface descriptor is
updated.

Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
8 years agoglsl: Don't abbreviate tessellation shader stage names.
Kenneth Graunke [Fri, 1 Jan 2016 00:28:08 +0000 (16:28 -0800)]
glsl: Don't abbreviate tessellation shader stage names.

I have a patch that writes shaders as .shader_test files, and it uses
this function to create the headers (i.e. [vertex shader]).

[tess ctrl shader] isn't a valid shader_runner header - it's spelled
out as [tessellation control shader].

There's no real reason to abbreviate it, so spell it out.

v2: Rebase on Rob's patches to move the code.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
8 years agomesa: remove link validation that should be done elsewhere
Timothy Arceri [Wed, 6 Jan 2016 01:40:12 +0000 (12:40 +1100)]
mesa: remove link validation that should be done elsewhere

Even if re-linking fails rendering shouldn't fail as the previous
successfully linked program will still be available. It also shouldn't
be possible to have an unlinked program as part of the current rendering
state.

This fixes a subtest in:
ES31-CTS.sepshaderobjs.StateInteraction

This change should improve performance on CPU limited benchmarks as noted
in commit d6c6b186cf308f.

From Section 7.3 (Program Objects) of the OpenGL 4.5 spec:

   "If a program object that is active for any shader stage is re-linked
    unsuccessfully, the link status will be set to FALSE, but any existing
    executables and associated state will remain part of the current rendering
    state until a subsequent call to UseProgram, UseProgramStages, or
    BindProgramPipeline removes them from use. If such a program is attached to
    any program pipeline object, the existing executables and associated state
    will remain part of the program pipeline object until a subsequent call to
    UseProgramStages removes them from use. An unsuccessfully linked program may
    not be made part of the current rendering state by UseProgram or added to
    program pipeline objects by UseProgramStages until it is successfully
    re-linked."

   "void UseProgram(uint program);

   ...

   An INVALID_OPERATION error is generated if program has not been linked, or
   was last linked unsuccessfully.  The current rendering state is not modified."

V2: apply the rule to both core and compat.

Cc: Tapani Pälli <tapani.palli@intel.com>
Cc: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
8 years agoglsl: allow multiple layout qualifiers for a single declaration
Timothy Arceri [Fri, 15 Jan 2016 01:43:10 +0000 (12:43 +1100)]
glsl: allow multiple layout qualifiers for a single declaration

From the ARB_shading_language_420pack spec:

   "More than one layout qualifier may appear in a single
   declaration. If the same layout-qualifier-name occurs in
   multiple layout qualifiers for the same declaration, the
   last one overrides the former ones."

The parser was already failing correctly when the extension is
not available but testing for duplicates within a single layout
qualifier was still causing this to fail when available as both
cases share the same function for merging.

Here we add a parameter to differentiate between the two uses
and apply it to the duplicate test.

Acked-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
8 years agoglsl: update parser to allow duplicate default layout qualifiers
Timothy Arceri [Mon, 18 Jan 2016 06:06:57 +0000 (17:06 +1100)]
glsl: update parser to allow duplicate default layout qualifiers

In order to only create a single node for each default declaration
we add a new boolean parameter to the in/out merge function to
only create one once we reach the rightmost layout qualifier.

From the ARB_shading_language_420pack spec:

   "More than one layout qualifier may appear in a single
   declaration. If the same layout-qualifier-name occurs in
   multiple layout qualifiers for the same declaration, the
   last one overrides the former ones."

Acked-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
8 years agoglsl: move default layout qualifier rules out of the parser
Timothy Arceri [Mon, 18 Jan 2016 08:13:03 +0000 (19:13 +1100)]
glsl: move default layout qualifier rules out of the parser

Acked-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
8 years agoglsl: split layout_defaults into specific types
Timothy Arceri [Mon, 18 Jan 2016 05:09:06 +0000 (16:09 +1100)]
glsl: split layout_defaults into specific types

This will allow merging of duplicate layout qualifiers as allowed
by ARB_shading_language_420pack

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
8 years agoglsl: allow duplicate layout-qualifier-names
Timothy Arceri [Fri, 15 Jan 2016 00:01:25 +0000 (11:01 +1100)]
glsl: allow duplicate layout-qualifier-names

This is added by ARB_enhanced_layouts although it doesn't fit
into any of the six main changes so we enable this independently.

From the ARB_enhanced_layouts spec:

   "More than one layout qualifier may appear in a single
   declaration. Additionally, the same layout-qualifier-name
   can occur multiple times within a layout qualifier or across
   multiple layout qualifiers in the same declaration"

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
8 years agoi965/vec4: Spaces around operators.
Matt Turner [Tue, 19 Jan 2016 20:12:38 +0000 (12:12 -0800)]
i965/vec4: Spaces around operators.

8 years agoi965: Inform compiler of variable range to silence warning.
Matt Turner [Fri, 15 Jan 2016 21:38:46 +0000 (13:38 -0800)]
i965: Inform compiler of variable range to silence warning.

Extends commit 6531ccb70 to silence the warning in release builds as
well.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years agoglsl: Restore Mesa-style to shader_enums.c/h.
Matt Turner [Fri, 15 Jan 2016 21:31:34 +0000 (13:31 -0800)]
glsl: Restore Mesa-style to shader_enums.c/h.

8 years agost/va: fix motion adaptive deinterlacing
Christian König [Mon, 18 Jan 2016 19:56:06 +0000 (20:56 +0100)]
st/va: fix motion adaptive deinterlacing

Signed-off-by: Christian König <christian.koenig@amd.com>
8 years agoutil/u_pstipple.c: copy immediates during transformation
Nicolai Hähnle [Fri, 15 Jan 2016 21:56:15 +0000 (16:56 -0500)]
util/u_pstipple.c: copy immediates during transformation

Apparently, nobody has combined stippling with a fragment shader
containing immediates in almost five years...

Fixes a bug in Kodi with radeonsi reported by Christian König.

Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Tested-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
8 years agomesa: Move sanity check of BindVertexBuffer for OpenGL ES 3.1
Marta Lofstedt [Fri, 8 Jan 2016 13:55:55 +0000 (14:55 +0100)]
mesa: Move sanity check of BindVertexBuffer for OpenGL ES 3.1

Sanity check of BindVertexBuffer for OpenGL ES in
_mesa_handle_bind_buffer_gen breaks OpenGL ES 2 conformance.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93426
Signed-off-by: Marta Lofstedt <marta.lofstedt@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
8 years agoglsl: fix interface block error message
Timothy Arceri [Tue, 19 Jan 2016 03:35:50 +0000 (14:35 +1100)]
glsl: fix interface block error message

Print the stream value not the pointer to the expression,
also use the unsigned format specifier.

Cc: 11.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years agonv50/ir: swap the least-ref'd source into src1 when both const/imm
Ilia Mirkin [Mon, 18 Jan 2016 03:28:19 +0000 (22:28 -0500)]
nv50/ir: swap the least-ref'd source into src1 when both const/imm

The whole point of inlining sources is to reduce loads. We can end up in
a situation where one value is used a lot of times, and one value is
used only once per instruction. The once-per-instruction one is the one
that should get inlined, but with the previous algorithm, it was given
no preference.

This flips things around to preferring putting less-referenced values
into src1 which increases the likelihood of them being inlined.
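
A minimal sketch of that preference, using a hypothetical and much simplified
operand type (not the nv50/ir Value class) and assuming the instruction is
commutative so that swapping its sources is legal:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified operand. */
struct operand {
    bool is_const_or_imm; /* constant-buffer value or immediate */
    unsigned refs;        /* number of uses of this value */
};

/* When both sources are const/imm, move the less-referenced one into
 * src1, the slot more likely to be inlined. Returns true if the two
 * operands were swapped. */
static bool
prefer_least_referenced_in_src1(struct operand *src0, struct operand *src1)
{
    if (src0 == NULL || src1 == NULL)
        return false;
    if (!src0->is_const_or_imm || !src1->is_const_or_imm)
        return false;
    if (src0->refs >= src1->refs)
        return false; /* src1 already holds the less-referenced value */

    struct operand tmp = *src0;
    *src0 = *src1;
    *src1 = tmp;
    return true;
}
```

A value used once per instruction then ends up in src1, while a heavily
shared value stays in src0, where a single load can serve all of its users.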

While we're at it, adjust the heuristic to not treat 0 as an immediate,
as well as (effectively) check for situations where LIMMs can't be
loaded. All this yields improvements on nvc0:

total instructions in shared programs : 6261157 -> 6255985 (-0.08%)
total gprs used in shared programs    : 945082 -> 943417 (-0.18%)
total local used in shared programs   : 30372 -> 30288 (-0.28%)
total bytes used in shared programs   : 50089256 -> 50047880 (-0.08%)

                local        gpr       inst      bytes
    helped          21         822        3332        3332
      hurt           0         278         565         565

And more importantly avoids generating really bad code with SSBOs, where
we end up checking a lot of different values (usually immediates) against
the length.

On nv50 we get comparable results, and even improve packing (bytes went
down more than instructions):

total instructions in shared programs : 6346564 -> 6341277 (-0.08%)
total gprs used in shared programs    : 728719 -> 725131 (-0.49%)
total local used in shared programs   : 3552 -> 3552 (0.00%)
total bytes used in shared programs   : 43995688 -> 43932928 (-0.14%)

                local        gpr       inst      bytes
    helped           0        1380        3252        3774
      hurt           0         287        1710        1365

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years agost/mesa: restore the stObj's size if it was cleared out
Ilia Mirkin [Sun, 17 Jan 2016 21:25:00 +0000 (16:25 -0500)]
st/mesa: restore the stObj's size if it was cleared out

An issue could still occur if the base level is set, but fixing that
would require a lot more logic.

This fixes the recently-failing texelFetch 3D tests because the mipmaps
were no longer being generated, which in turn caused the copying logic
to be hit, which in turn didn't work because of the broken
width/height/depth.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agofreedreno/a4xx: use smaller threadsize for more registers
Rob Clark [Mon, 18 Jan 2016 20:30:53 +0000 (15:30 -0500)]
freedreno/a4xx: use smaller threadsize for more registers

Once we go past half of the "GPR" register file, it seems like we need
to run frag shader with smaller threadsize.  (The vertex shader already
runs at TWO_QUADS, which is the minimum.)

Signed-off-by: Rob Clark <robclark@freedesktop.org>
8 years agofreedreno: per-generation OUT_IB packet
Rob Clark [Mon, 18 Jan 2016 20:22:27 +0000 (15:22 -0500)]
freedreno: per-generation OUT_IB packet

Some a4xx firmware doesn't implement the "PFD" (prefetch-disabled)
version of the CP_INDIRECT_BUFFER packet.  So allow for PFD vs PFE per
generation.  Switch a3xx and a4xx over to using prefetch-enabled version
(which is also what blob does.. it seems only on a2xx we cannot use
PFE).

Signed-off-by: Rob Clark <robclark@freedesktop.org>
8 years agogallium: bundle the compat header u_pwr8.h in the tarball
Emil Velikov [Mon, 18 Jan 2016 11:34:14 +0000 (13:34 +0200)]
gallium: bundle the compat header u_pwr8.h in the tarball

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
8 years agomapi: include gl.xml in the tarball
Emil Velikov [Mon, 18 Jan 2016 11:06:28 +0000 (13:06 +0200)]
mapi: include gl.xml in the tarball

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
8 years agoi965: adding missing headers to the dist tarball
Emil Velikov [Thu, 14 Jan 2016 07:28:21 +0000 (09:28 +0200)]
i965: adding missing headers to the dist tarball

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
8 years agost/va: add motion adaptive deinterlacing v2
Christian König [Sun, 13 Dec 2015 10:44:13 +0000 (11:44 +0100)]
st/va: add motion adaptive deinterlacing v2

v2: minor cleanup

Signed-off-by: Christian König <christian.koenig@amd.com>
8 years agogallium/radeon: Rename do_invalidate_resource to invalidate_buffer
Michel Dänzer [Fri, 15 Jan 2016 07:02:22 +0000 (16:02 +0900)]
gallium/radeon: Rename do_invalidate_resource to invalidate_buffer

And only call it from r600_invalidate_resource for buffer resources.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
8 years agost/dri: Don't call invalidate_resource for NULL depth/stencil buffers
Michel Dänzer [Fri, 15 Jan 2016 06:46:31 +0000 (15:46 +0900)]
st/dri: Don't call invalidate_resource for NULL depth/stencil buffers

Fixes crash in 4 EGL piglit tests with radeonsi.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
8 years agoradeonsi: Avoid warning about LLVM generating R_0286D0_SPI_PS_INPUT_ADDR
Michel Dänzer [Fri, 15 Jan 2016 03:18:29 +0000 (12:18 +0900)]
radeonsi: Avoid warning about LLVM generating R_0286D0_SPI_PS_INPUT_ADDR

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
8 years agoradeonsi: Print "LLVM emitted unknown config register" warning only once
Michel Dänzer [Fri, 15 Jan 2016 03:13:15 +0000 (12:13 +0900)]
radeonsi: Print "LLVM emitted unknown config register" warning only once

Say "LLVM" instead of "Compiler" for clarity.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
8 years agollvmpipe: use vpkswss when dst is signed
Oded Gabbay [Sun, 17 Jan 2016 20:15:40 +0000 (22:15 +0200)]
llvmpipe: use vpkswss when dst is signed

This patch fixes a bug when building a pack instruction.

For POWER (altivec), in case the destination is signed and the
src width is 32, we need to use vpkswss. The original code used vpkuwus,
which emits an unsigned result.
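
To see why the choice of instruction matters, here is a scalar model of one
lane of each pack. The function names are hypothetical and the real
instructions operate on whole vectors. This only illustrates the saturation
behaviour.

```c
#include <stdint.h>

/* One lane of a signed-saturating 32 -> 16 bit pack (what vpkswss does
 * per element): clamp to the int16_t range. */
static int16_t
pack_s32_to_s16_sat(int32_t v)
{
    if (v > INT16_MAX)
        return INT16_MAX;
    if (v < INT16_MIN)
        return INT16_MIN;
    return (int16_t)v;
}

/* One lane of an unsigned-saturating pack (vpkuwus): the 32-bit input
 * is treated as unsigned and clamped to the uint16_t range. */
static uint16_t
pack_u32_to_u16_sat(uint32_t v)
{
    return v > UINT16_MAX ? UINT16_MAX : (uint16_t)v;
}
```

For an input of -40000 the signed pack yields -32768, whereas the unsigned
pack sees a huge positive number and yields 65535, which reads back as -1
when the destination is interpreted as signed.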

This fixes the following piglit tests on ppc64le:
- spec@arb_color_buffer_float@gl_rgba8-drawpixels
- shaders@glsl-fs-fogscale

I've also corrected some coding style issues in the function.

v2: Returned else statements to vmware style

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
8 years agoglsl: fix subroutine lowering reusing actual parameters
Dave Airlie [Sun, 17 Jan 2016 04:23:35 +0000 (14:23 +1000)]
glsl: fix subroutine lowering reusing actual parameters

One of the oglconform tests was crashing here, and it was
due to not cloning the actual parameters before creating the
new call. This makes a call clone function that does the right
things to make sure we clone all the needed info, and points
the callee at it. (It differs from ->clone due to this).

This may fix https://bugs.freedesktop.org/show_bug.cgi?id=93722, I had this
patch in my cts fixes tree, but hadn't had time to make sure I liked it.

Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
8 years agoglsl: remove special case for detecting stream duplicates
Timothy Arceri [Fri, 15 Jan 2016 02:45:49 +0000 (13:45 +1100)]
glsl: remove special case for detecting stream duplicates

Any duplicates in a single declaration will already fail the
generic duplicates test due to the explicit_stream flag being set.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
8 years agoglsl: add missing explicit_stream flag to has_layout()
Timothy Arceri [Fri, 15 Jan 2016 02:45:48 +0000 (13:45 +1100)]
glsl: add missing explicit_stream flag to has_layout()

This will allow the ARB_shading_language_420pack rules in
glsl_parser.yy for catching duplicate layout qualifiers to be
triggered for the stream identifier rather than relying on the
code meant to catch duplicates within a single layout(...)

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
8 years agomesa: fix segfault in glUniformSubroutinesuiv()
Timothy Arceri [Sun, 17 Jan 2016 05:09:08 +0000 (16:09 +1100)]
mesa: fix segfault in glUniformSubroutinesuiv()

From Section 7.9 (SUBROUTINE UNIFORM VARIABLES) of the OpenGL
4.5 Core spec:

   "The command

       void UniformSubroutinesuiv(enum shadertype, sizei count,
                                  const uint *indices);

   will load all active subroutine uniforms for shader stage
   shadertype with subroutine indices from indices, storing
   indices[i] into the uniform at location i. The indices for
   any locations between zero and the value of
   ACTIVE_SUBROUTINE_UNIFORM_LOCATIONS minus one which are not
   used will be ignored."

V2: simplify NULL check suggested by Jason.
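
The quoted rule and the NULL check can be sketched like this. The data
layout is an assumption made for illustration (one pointer per location, NULL
for a location no active uniform uses) and the names are hypothetical. It is
not Mesa's implementation.

```c
#include <stddef.h>

/* Hypothetical, simplified subroutine uniform. */
struct subroutine_uniform {
    unsigned index; /* currently selected subroutine index */
};

/* Store indices[i] into the uniform at location i, skipping locations
 * that are not used. */
static void
load_subroutine_indices(struct subroutine_uniform **remap, size_t count,
                        const unsigned *indices)
{
    for (size_t i = 0; i < count; i++) {
        if (remap[i] == NULL)
            continue; /* unused location: indices[i] is ignored */
        remap[i]->index = indices[i];
    }
}
```

Without the check, the first unused location would be dereferenced as a NULL
pointer, which is presumably the kind of segfault the subject line refers to.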

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
https://bugs.freedesktop.org/show_bug.cgi?id=93731