Roland Scheidegger [Thu, 8 Aug 2013 00:34:32 +0000 (02:34 +0200)]
gallivm: honor d3d10's wishes of out-of-bounds behavior for texture size query
Specifically, must return 0 for non-existent mip levels (and non-existent
textures which is an unsolved problem) for everything but total mip count.
Reviewed-by: Zack Rusin <zackr@vmware.com>
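The required behavior can be modeled per element in scalar C. This is only a sketch: `query_mip_width` and its parameters are hypothetical names, and the actual gallivm code emits vectorized LLVM IR rather than scalar code.

```c
/* Scalar model of a d3d10-style size query (hypothetical helper, not the
 * gallivm implementation). Assumes first_level <= last_level < 32. */
static unsigned
query_mip_width(unsigned base_width, unsigned first_level,
                unsigned last_level, unsigned lod)
{
   unsigned width;

   /* A lod past the last level names a non-existent mip level: return 0. */
   if (lod > last_level - first_level)
      return 0;

   /* Each mip level halves the size, but never drops below 1. */
   width = base_width >> (first_level + lod);
   return width ? width : 1;
}
```

The total mip count is not covered here; per the message above it is the one value that is still returned for out-of-range levels.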
Paul Berry [Tue, 6 Aug 2013 19:17:17 +0000 (12:17 -0700)]
glsl: Enable ARB_fragment_coord_conventions functionality in GLSL 1.50.
GLSL 1.50 incorporates the functionality of the
ARB_fragment_coord_conventions extension, so we need to make this
functionality available even if the extension isn't enabled.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Paul Berry [Mon, 5 Aug 2013 22:46:43 +0000 (15:46 -0700)]
main: Fix deprecation of glLineWidth()
From section E.1 (Profiles and Deprecated Features of OpenGL 3.0)
of the OpenGL 3.0 spec:
"LineWidth is not deprecated, but values greater than 1.0
will generate an INVALID VALUE error"
From context it is clear that values greater than 1.0 should only
generate an INVALID VALUE error in a forward-compatible context.
The code was correctly quoting this spec text, but it was disallowing
all line widths in forward-compatible contexts, instead of just widths
greater than 1.0.
This patch introduces the correct check, so that setting a line width
of 1.0 or less is permitted.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
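The corrected condition can be sketched as below. `line_width_is_error` and its parameters are hypothetical names chosen for illustration, not the code in Mesa; the non-positive check reflects the general glLineWidth rule that widths <= 0 are invalid in any context.

```c
#include <stdbool.h>

/* Hypothetical helper: returns true when glLineWidth(width) should raise
 * GL_INVALID_VALUE. */
static bool
line_width_is_error(float width, bool forward_compatible)
{
   /* Non-positive widths are invalid in every context. */
   if (width <= 0.0f)
      return true;

   /* Wide lines are only rejected in forward-compatible contexts;
    * a width of 1.0 or less stays legal there. */
   return forward_compatible && width > 1.0f;
}
```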
Roland Scheidegger [Fri, 9 Aug 2013 15:29:52 +0000 (17:29 +0200)]
util: (trivial) fix asm input/output list for fxsave
Otherwise gcc might do very unsafe optimizations, spotted by Uros Bizjak.
Hopefully this time it's finally right?
Alex Deucher [Fri, 9 Aug 2013 01:11:22 +0000 (21:11 -0400)]
r600g: disable GPUVM by default
Cayman and trinity systems still seem to suffer from
stability problems with GPUVM. This also fixes compute
on these asics. It can still be enabled for testing
by setting env var RADEON_VA=true.
Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=65958
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
CC: "9.2" <mesa-stable@lists.freedesktop.org>
CC: "9.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
Zack Rusin [Fri, 9 Aug 2013 00:51:11 +0000 (20:51 -0400)]
softpipe: fix the regressions
softpipe has a really weird handling of the draw attrs, let's
just not inject outputs in its data.
Trivial.
Zack Rusin [Thu, 8 Aug 2013 19:44:10 +0000 (15:44 -0400)]
draw: rewrite primitive assembler
We can't be injecting the primitive id's in the pipeline because
by that time the primitives have already been decomposed. To
properly number the primitives we need to handle the adjacency
primitives by hand. This patch moves the prim id injection into
the original primitive assembler and completely removes the
useless pipeline stage.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Zack Rusin [Wed, 7 Aug 2013 00:25:53 +0000 (20:25 -0400)]
draw: reset the vertex id when injecting new primitive id
Without resetting the vertex id, with primitives where the same
vertex is used with different primitives (e.g. tri/lines strips)
our vbuf module won't re-emit those vertices with the changed
primitive id. So let's reset the vertex id whenever injecting
new primitive id to make sure that the vertex data is correctly
emitted.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Zack Rusin [Wed, 7 Aug 2013 00:24:26 +0000 (20:24 -0400)]
draw: cleanup the extra attribs
Before inserting new front face and prim id outputs, clean up
the old extra outputs, otherwise our cache will use previous
output slots which will break as soon as outputs of the current
shader don't match the last.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Dieter Nützel [Thu, 8 Aug 2013 23:23:09 +0000 (01:23 +0200)]
util: (trivial) fix more compile errors in u_cpu_detect (gcc/x86 this time).
Oops. Should fix https://bugs.freedesktop.org/show_bug.cgi?id=67921
Chad Versace [Thu, 1 Aug 2013 15:10:31 +0000 (08:10 -0700)]
egl: Do not export private symbols
libEGL was incorrectly exporting *all* symbols, public and private.
This patch adds -fvisibility=hidden to libEGL's linker flags to ensure
that only symbols annotated with __attribute__((visibility("default")))
get exported.
Sanity-checked with libEGL's builtin DRI2 driver and the i965 DRI driver
by running Piglit on X/EGL and by running weston-gears on Weston as an
X client.
Sanity-checked with libEGL's Gallium driver (which is not built-in) and
the swrast Gallium driver by running es2gears_x11.
Kristian reviewed the symbol diff in `nm libEGL.so`.
CC: "9.2" <mesa-stable@lists.freedesktop.org>
CC: Ian Romanick <idr@freedesktop.org>
Acked-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
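As a rough illustration of the mechanism (GCC/Clang only), the sketch below annotates one function for export and leaves another unannotated. `EXAMPLE_PUBLIC` and both function names are hypothetical; libEGL uses its own export annotation.

```c
/* Hypothetical export macro, assumed for this sketch only. */
#define EXAMPLE_PUBLIC __attribute__((visibility("default")))

/* Not annotated: hidden from the dynamic symbol table when this file is
 * compiled with -fvisibility=hidden. */
int example_private_helper(void);
int example_private_helper(void) { return 41; }

/* Annotated: stays exported even under -fvisibility=hidden. */
EXAMPLE_PUBLIC int example_public_entry(void);
int example_public_entry(void) { return example_private_helper() + 1; }
```

Building this into a shared object with -fvisibility=hidden and inspecting it with `nm -D` is one way to compare the exported symbols, in the spirit of the `nm libEGL.so` diff mentioned above.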
Kenneth Graunke [Tue, 6 Aug 2013 21:36:09 +0000 (14:36 -0700)]
i965: Remember to call intel_prepare_render() before blitting.
Otherwise, blits to the window system buffer may cause crashes,
since dst_irb->mt may be NULL.
This code is lifted straight out of brw_blorp_framebuffer()'s
try_blorp_blit() helper.
Fixes crashes in Piglit's fbo-sys-blit on systems without BLORP.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65919
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
Roland Scheidegger [Thu, 8 Aug 2013 17:08:57 +0000 (19:08 +0200)]
util: (trivial) fix compile error with MSVC on x86
Roland Scheidegger [Wed, 7 Aug 2013 18:51:52 +0000 (20:51 +0200)]
gallivm: honor d3d10 floating point rules for shadow comparisons
d3d10 specifies ordered comparisons for everything but not_equal which is
unordered (http://msdn.microsoft.com/en-us/library/windows/desktop/
cc308050.aspx).
OpenGL probably doesn't care.
Reviewed-by: Zack Rusin <zackr@vmware.com>
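C's IEEE 754 comparison operators follow the same split, which makes the rule easy to demonstrate: relational operators and `==` are ordered (false if either operand is NaN), while `!=` is unordered (true if either operand is NaN). The function names below are hypothetical, and the sketch assumes the compiler is not run with -ffast-math or a similar option.

```c
#include <stdbool.h>

/* Ordered comparisons: false whenever either operand is NaN. */
static bool shadow_cmp_less(float ref, float texel)  { return ref < texel; }
static bool shadow_cmp_equal(float ref, float texel) { return ref == texel; }

/* Unordered comparison: true whenever either operand is NaN. */
static bool shadow_cmp_not_equal(float ref, float texel) { return ref != texel; }
```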
Roland Scheidegger [Wed, 7 Aug 2013 18:33:54 +0000 (20:33 +0200)]
softpipe: don't clamp reference value for shadow comparison for float formats
Clamping is only done for fixed-point formats as part of conversion to
texture format.
Reviewed-by: Zack Rusin <zackr@vmware.com>
Roland Scheidegger [Wed, 7 Aug 2013 18:25:38 +0000 (20:25 +0200)]
gallivm: don't clamp reference value for shadow comparison for float formats
This is wrong both for OpenGL and d3d. (In fact clamping is a side effect
of converting to depth format, so this should really do quantization too
at least in d3d10 for the comparisons to be truly correct.)
Reviewed-by: Zack Rusin <zackr@vmware.com>
Roland Scheidegger [Wed, 7 Aug 2013 15:09:45 +0000 (17:09 +0200)]
gallivm: propagate scalar_lod to emit_size_query too
Clearly the returned values need to be per-element if the lod is per element.
Does not actually change behavior yet.
Reviewed-by: Zack Rusin <zackr@vmware.com>
Roland Scheidegger [Wed, 7 Aug 2013 15:03:45 +0000 (17:03 +0200)]
gallium: clarify SVIEWINFO opcode
This opcode is quite problematic in tgsi, while it tries to mirror
d3d10 resinfo it can't really do what's stated there due to missing
the crazy return type modifiers. Hence specify this is ignored along
with the swizzle.
(Other options would be to have multiple opcodes or specify the ret
type modifier maybe in dst_reg as there's padding bits left there but
it is the only instruction allowing this.)
Reviewed-by: Zack Rusin <zackr@vmware.com>
Roland Scheidegger [Tue, 6 Aug 2013 18:50:47 +0000 (20:50 +0200)]
gallivm: fix out-of-bounds behavior for fetch/ld
For d3d10 and ARB_robust_buffer_access_behavior, we are required to return
0 for out-of-bounds coordinates (for which we can just enable the code that
was already there but disabled). Additionally, we also need to return 0 for
out-of-bounds mip level and out-of-bounds layer. This changes the logic
so instead of clamping the level/layer, an out-of-bound mask is computed
instead in this case (actual clamping then can be omitted just like with
coordinates, since we set the fetch offset to zero if that happens anyway).
Reviewed-by: Zack Rusin <zackr@vmware.com>
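The mask-instead-of-clamp idea can be sketched in scalar C for a 1D fetch. `fetch_texel_1d` is a hypothetical name, the real code operates on LLVM vectors, and the sketch assumes `width > 0` so that offset 0 is always readable.

```c
#include <stdint.h>

/* Scalar model of an out-of-bounds-safe fetch (hypothetical helper). */
static uint32_t
fetch_texel_1d(const uint32_t *texels, int width, int x)
{
   /* All ones when the coordinate is out of bounds, zero otherwise. */
   uint32_t oob_mask = (x < 0 || x >= width) ? ~0u : 0u;

   /* Force the offset to zero instead of clamping the coordinate. */
   uint32_t offset = (uint32_t)x & ~oob_mask;

   /* The fetched value is discarded (zeroed) when out of bounds. */
   return texels[offset] & ~oob_mask;
}
```

The same mask would be OR-ed with the level and layer checks in a fuller model, so a single select at the end covers all three cases.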
Roland Scheidegger [Tue, 6 Aug 2013 01:30:37 +0000 (03:30 +0200)]
util: try much harder to set DAZ flag
While so far this only causes some harmless test failures, there's lots more
cpus with DAZ. All 64bit capable ones can do it (particularly relevant for
AMD cpus as they supported sse3 very very late) but if really necessary we
can check support for that for real with some more magic.
(In fact just about ANY cpu with sse2 can support DAZ. I believe the only
exceptions are first gen P4 (Willamette), and of those only early steppings
can't do it; it's almost like intel forgot to add it... - a real pity
though, as docs say you can't just try to set it as they will throw a GPF.)
While this was meant to address https://bugs.freedesktop.org/show_bug.cgi?id=67672
it does not fix it. Most likely the tests need fixing as I don't think
there's any guarantee about denorm handling in the reference math library
functions if the flags aren't set to standard values. Nevertheless enabling
DAZ on all cpus which can do it should be the right thing to do.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Roland Scheidegger [Tue, 6 Aug 2013 14:55:47 +0000 (16:55 +0200)]
util: implement table-based + linear interpolation linear-to-srgb conversion
Should be much faster, seems to work in softpipe.
While here (also it's now disabled), fix up the pow factor - the former value
is what is in GL core, but it is not actually accurate to fp32 standard
(as it is 1.0/2.4), and if someone would do all the accurate math there's no
reason to waste 8 mantissa bits or so...
v2: use real table generating function instead of just printing the values
(might take a bit longer as it does calculations on some 3+ million floats
but much more descriptive obviously).
Also fix up another inaccurate pow factor (this time in the python code) -
wondering where the couple one bit errors came from :-(.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Zack Rusin <zackr@vmware.com>
Roland Scheidegger [Fri, 2 Aug 2013 22:24:29 +0000 (00:24 +0200)]
gallivm: fix comment wrt srgb accuracy.
I think it's actually not good enough now...
Chia-I Wu [Thu, 8 Aug 2013 05:43:57 +0000 (13:43 +0800)]
ilo: get rid of GPE tables completely
Move the estimate functions out of the tables and kill the tables.
Chia-I Wu [Thu, 8 Aug 2013 05:30:54 +0000 (13:30 +0800)]
ilo: clean up GPE header inclusions
Other than the regular cleanups, this reduces the number of source files that
need to be recompiled when GPE functions are changed.
Chia-I Wu [Thu, 8 Aug 2013 05:18:17 +0000 (13:18 +0800)]
ilo: initialize alpha test state in ilo_gpe_init_dsa
This could speed up BLEND_STATE and COLOR_CALC_STATE emission a bit.
Chia-I Wu [Thu, 8 Aug 2013 05:10:24 +0000 (13:10 +0800)]
ilo: fold gen6_translate_index_size into the caller
There is only one caller so fold it.
Chia-I Wu [Thu, 8 Aug 2013 05:01:39 +0000 (13:01 +0800)]
ilo: fold gen6_translate_depth_format into the caller
There is only one caller so fold it.
Courtney Goeltzenleuchter [Mon, 5 Aug 2013 21:57:31 +0000 (15:57 -0600)]
ilo: Call GPE emit functions directly.
Eliminate pipeline and GPE function vectors and have the pipeline functions
call the GPE emit functions directly.
Courtney Goeltzenleuchter [Mon, 5 Aug 2013 20:17:31 +0000 (14:17 -0600)]
ilo: move emit functions so that they can be inlined.
Tom Stellard [Thu, 8 Aug 2013 00:26:17 +0000 (17:26 -0700)]
r300g/compiler/tests: Pass the required LDFLAGS when building the test program
CC: "9.2" <mesa-stable@lists.freedesktop.org>
Tom Stellard [Thu, 8 Aug 2013 00:26:01 +0000 (17:26 -0700)]
r300g/compiler/tests: Fix segfault
CC: "9.2" <mesa-stable@lists.freedesktop.org>
Kristian Høgsberg [Wed, 7 Aug 2013 18:19:59 +0000 (11:19 -0700)]
gallium-egl: Commit the rest of the native_wayland_drm_bufmgr_helper v2 patch
I missed Ander's v2 on the list which fixed non-wayland compilation:
http://lists.freedesktop.org/archives/mesa-dev/2013-July/042062.html
Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Ander Conselvan de Oliveira [Thu, 18 Jul 2013 12:11:25 +0000 (15:11 +0300)]
egl: Update to Wayland 1.2 server API
Since Wayland 1.2, struct wl_buffer and a few functions are deprecated.
References to wl_buffer are replaced with wl_resource and some getter
functions and calls to deprecated functions are replaced with the proper
new API. The latter changes are related to resource versioning.
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Ander Conselvan de Oliveira [Thu, 18 Jul 2013 12:11:24 +0000 (15:11 +0300)]
gallium-egl: Don't add a listener for wl_drm twice in wayland platform
A listener is added just after the interface is bound, in
registry_handle_global().
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Ander Conselvan de Oliveira [Thu, 18 Jul 2013 12:11:23 +0000 (15:11 +0300)]
gallium-egl: Simplify native_wayland_drm_bufmgr_helper interface
The helper provides a series of functions to ease the implementation
of the WL_bind_wayland_display extension on different platforms. But
even with the helpers there was still a bit of duplicated code between
platforms, with the drm authentication being the only part that
differs.
This patch changes the bufmgr interface to provide a self contained
object with a create function that takes a drm authentication callback
as an argument. That way all the helper functions are made static and
the "_helper" suffix was removed from the source file name.
This change also removes the mix of Wayland client and server code in
the wayland drm platform source file. All the uses of libwayland-server
are now contained in native_wayland_drm_bufmgr.c.
Changes to the drm platform are only compile tested.
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Chia-I Wu [Wed, 7 Aug 2013 04:21:26 +0000 (12:21 +0800)]
ilo: speed up 3DSTATE_VERTEX_BUFFERS emission a bit
Ignore vbuffer_mask which does not gain us anything.
Chia-I Wu [Wed, 7 Aug 2013 09:16:46 +0000 (17:16 +0800)]
ilo: skip state emission when reducing sampler count
When the number of sampler states bound is reduced, we are good to keep
referencing the old SAMPLER_STATE array and skip emitting a new one.
Chia-I Wu [Wed, 7 Aug 2013 08:42:53 +0000 (16:42 +0800)]
ilo: simplify setting of shader samplers and views
Remove the special path that unbinds all samplers/views not in the range.
Just make another call to unbind them.
Chia-I Wu [Wed, 7 Aug 2013 09:32:38 +0000 (17:32 +0800)]
ilo: correctly check for stencil ref change
I intended to do a memcmp(), not a memcpy()...
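A minimal sketch of the intended check follows. `example_stencil_ref` and `update_stencil_ref` are hypothetical stand-ins, not the ilo code: the point is that `memcmp()` reports whether the state differs, whereas `memcpy()` returns its destination pointer, so using it in the condition would always look like a change.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the stencil reference state. */
struct example_stencil_ref {
   unsigned char ref_value[2];
};

/* Returns true (and stores the new state) only when the state changed. */
static bool
update_stencil_ref(struct example_stencil_ref *cur,
                   const struct example_stencil_ref *new_ref)
{
   /* Compare first: identical state needs no re-emission. */
   if (memcmp(cur, new_ref, sizeof(*cur)) == 0)
      return false;

   *cur = *new_ref;
   return true;
}
```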
Zack Rusin [Tue, 6 Aug 2013 06:54:36 +0000 (02:54 -0400)]
draw: fix slot detection
Nowadays -1 for slots means that the semantic is not present, so
we need to store it in a signed variable, otherwise <0 comparisons
are pointless. Fixes
http://bugzilla.eng.vmware.com/show_bug.cgi?id=67811 (at least
with softpipe, edgeflags don't work with llvmpipe)
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
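The pitfall is easy to reproduce in isolation. In the sketch below (hypothetical names, not the draw module's code), the "not present" marker -1 becomes the largest unsigned value once stored in an unsigned variable, so a `< 0` test on it can never succeed; keeping the slot signed makes the check meaningful.

```c
#include <stdbool.h>

/* With a signed slot, -1 reliably means "semantic not present". */
static bool
semantic_present(int slot)
{
   return slot >= 0;
}

/* What the value turns into when it is stored in an unsigned variable:
 * -1 wraps to the maximum unsigned value and is never negative. */
static unsigned
store_as_unsigned(int slot)
{
   return (unsigned)slot;
}
```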
Laurent Carlier [Tue, 6 Aug 2013 22:05:25 +0000 (00:05 +0200)]
gallivm: Fix build - Remove TargetOptions.RealignStack for llvm>=3.4
Since llvm-3.4svn r187618, TargetOptions doesn't provide
RealignStack, so only enable it with llvm<3.4
This option must now be specified using function attributes, see LLVM
commit r187618
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Kenneth Graunke [Thu, 1 Aug 2013 22:11:40 +0000 (15:11 -0700)]
i965: Add #defines for the MI_LOAD_REGISTER_MEM command.
This command reads a value from memory and writes it to a register (the
opposite of MI_STORE_REGISTER_MEM). It's only available on Gen7+.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Kenneth Graunke [Thu, 1 Aug 2013 22:11:39 +0000 (15:11 -0700)]
i965: Initialize the intel_context::bufmgr pointer earlier.
This prevents a crash in a future patch.
_mesa_initialize_context() creates a default transform feedback object
by calling the NewTransformFeedbackObject() driver hook. Eventually,
we'll want to subclass that and allocate a buffer object. This means
passing brw->bufmgr to drm_intel_alloc_bo(), and crashing if it isn't
initialized yet.
The buffer manager is actually already initialized; we just hadn't
copied the pointer from intel_screen to intel_context quite early
enough.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Kenneth Graunke [Thu, 1 Aug 2013 22:11:38 +0000 (15:11 -0700)]
i965: Tidy preprocessor macros for SO_PRIM_STORAGE_NEEDED registers.
Gen7+ supports four transform feedback streams. Using a function-like
macro makes it easy to access them by stream number or loop over them.
"GEN7_" prefixes are more common than "_IVB" suffixes, so use that.
Gen6 only supports a single stream, so the single #define should be
fine. However, SO_NUM_PRIM_STORAGE_NEEDED was a poor name. For one,
the word "NUM" doesn't appear in the actual name of the register.
It's also confusingly generic, as it doesn't exist on Gen7+. Add a
"GEN6_" prefix for clarity.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
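A function-like macro of this kind might look like the sketch below. The base offset, the per-stream stride, and the `EXAMPLE_` names are placeholders assumed for illustration; they should not be read as the values in the i965 headers.

```c
#include <stdint.h>

/* Placeholder base offset and stride: assumptions for illustration only. */
#define EXAMPLE_SO_BASE      0x5240
#define EXAMPLE_MAX_STREAMS  4

/* One register per transform feedback stream, selected by stream number. */
#define GEN7_SO_PRIM_STORAGE_NEEDED(n)  (EXAMPLE_SO_BASE + (n) * 8)

/* Looping over the streams becomes a plain for loop. */
static void
collect_stream_registers(uint32_t regs[EXAMPLE_MAX_STREAMS])
{
   for (int i = 0; i < EXAMPLE_MAX_STREAMS; i++)
      regs[i] = GEN7_SO_PRIM_STORAGE_NEEDED(i);
}
```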
Kenneth Graunke [Thu, 1 Aug 2013 22:11:37 +0000 (15:11 -0700)]
i965: Tidy preprocessor macros for SO_NUM_PRIMS_WRITTEN registers.
Gen7+ supports four transform feedback streams. Using a function-like
macro makes it easy to access them by stream number or loop over them.
"GEN7_" prefixes are more common than "_IVB" suffixes, so we use that.
Gen6 only supports a single stream, so the single #define should be
fine. However, SO_NUM_PRIMS_WRITTEN was confusingly generic, as it
doesn't exist on Gen7+. Add a "GEN6_" prefix for clarity.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Christoph Bumiller [Tue, 6 Aug 2013 20:20:25 +0000 (22:20 +0200)]
nvc0: don't access array out of bounds on unexpected sample count
Emil Velikov [Fri, 21 Jun 2013 17:04:55 +0000 (18:04 +0100)]
nv50: handle pure integer vertex attributes
And as a side effect fix a crash in the following piglit test:
general/attribs GL3
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Cc: "9.2 and 9.1" mesa-stable@lists.freedesktop.org
Samuel Pitoiset [Thu, 25 Jul 2013 08:35:36 +0000 (10:35 +0200)]
nvc0: implement MP performance counters for nvc0:nvd9
Samuel Pitoiset [Thu, 25 Jul 2013 08:35:35 +0000 (10:35 +0200)]
nvc0: implement compute support for nvc0
Tested on nvc0, nvc1, nvcf and nvd9.
Samuel Pitoiset [Thu, 25 Jul 2013 08:35:34 +0000 (10:35 +0200)]
nvc0: add more MP counters for nve4
Ian Romanick [Sun, 28 Jul 2013 20:08:27 +0000 (13:08 -0700)]
mesa: Generate a renderbuffer wrapper even if the texture has no image
This prevents a segfault in check_begin_texture_render when an FBO is
rebound while in this state. This fixes the piglit test
fbo-incomplete-invalid-texture.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "9.1 9.2" mesa-stable@lists.freedesktop.org
Ian Romanick [Sat, 27 Jul 2013 19:27:45 +0000 (12:27 -0700)]
mesa: Validate the layer selection of an array texture too
Previously only the slice of a 3D texture was validated in the FBO
completeness check. This fixes the failure in the 'invalid layer of an
array texture' subtest of piglit's fbo-incomplete test.
v2: 1D_ARRAY textures have Depth == 1. Instead, compare against Height.
v3: Handle CUBE_MAP_ARRAY textures too. Noticed by Marek.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "9.1 9.2" mesa-stable@lists.freedesktop.org
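The layer check described in v2 can be sketched as follows. The enum and function are hypothetical and cover only three targets; cube map arrays (v3) are left out because this message does not say how their layer count is derived.

```c
#include <stdbool.h>

/* Hypothetical subset of texture targets for this sketch. */
enum example_target {
   EXAMPLE_1D_ARRAY,
   EXAMPLE_2D_ARRAY,
   EXAMPLE_3D
};

/* Returns true when zoffset names an existing slice or layer. */
static bool
layer_is_valid(enum example_target target, unsigned height,
               unsigned depth, unsigned zoffset)
{
   /* 1D_ARRAY textures have Depth == 1; their layer count is the Height. */
   unsigned layers = (target == EXAMPLE_1D_ARRAY) ? height : depth;

   /* Strictly less than: zoffset == layers is already out of range. */
   return zoffset < layers;
}
```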
Ian Romanick [Sat, 27 Jul 2013 19:16:56 +0000 (12:16 -0700)]
mesa: Don't call driver RenderTexture for invalid zoffset
This fixes the segfault in the 'invalid slice of 3D texture' and
'invalid layer of an array texture' subtests of piglit's fbo-incomplete
test.
The 'invalid layer of an array texture' subtest still fails.
v2: Fix off-by-one comparison error noticed by Chris Forbes. Also,
1D_ARRAY textures have Depth == 1. Instead, compare against Height.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v1]
Cc: "9.1 9.2" mesa-stable@lists.freedesktop.org
Ian Romanick [Sat, 27 Jul 2013 19:04:20 +0000 (12:04 -0700)]
mesa: Don't call driver RenderTexture for really broken textures
This fixes the segfault in the '0x0 texture' subtest of piglit's
fbo-incomplete test.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "9.1 9.2" mesa-stable@lists.freedesktop.org
Ian Romanick [Sat, 27 Jul 2013 19:03:45 +0000 (12:03 -0700)]
mesa: Remove stray debug printfs in attachment completeness code
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "9.1 9.2" mesa-stable@lists.freedesktop.org
Ian Romanick [Fri, 19 Jul 2013 00:39:22 +0000 (17:39 -0700)]
mesa: Treat glBindFramebuffer and glBindFramebufferEXT more correctly
Allow user-generated names for glBindFramebufferEXT on desktop GL.
Disallow its use altogether for core profiles.
Names bound with glBindFramebuffer in desktop OpenGL are still
(incorrectly) shared across the share group instead of being
per-context. This gets us a bit closer to being strictly conformant.
v2: Disallow glBindFramebufferEXT in 3.1 by not installing it in the
dispatch table. Suggested by Jordan.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v1]
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> [v1]
Cc: mesa-stable@lists.freedesktop.org
Ian Romanick [Fri, 19 Jul 2013 00:38:16 +0000 (17:38 -0700)]
mesa: Treat glBindRenderbuffer and glBindRenderbufferEXT correctly
Allow user-generated names for glBindRenderbufferEXT on desktop GL.
Disallow its use altogether for core profiles.
v2: Disallow glBindRenderbufferEXT in 3.1 by not installing it in the
dispatch table. Suggested by Jordan.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v1]
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> [v1]
Cc: mesa-stable@lists.freedesktop.org
Michel Dänzer [Tue, 6 Aug 2013 08:45:50 +0000 (10:45 +0200)]
radeonsi: Number of SGPRs retrieved from LLVM already includes VCC
Fixes spurious 'Assertion `num_sgprs <= 104' failed.' with shaders using
all 104 SGPRs.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Christian König <christian.koenig@amd.com>
Kenneth Graunke [Fri, 2 Aug 2013 07:11:10 +0000 (00:11 -0700)]
i965: Don't allocate curbe buffers on Gen6+.
These are only used on Gen4-5. Why waste the 8kB of space?
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Vinson Lee [Sun, 4 Aug 2013 08:18:28 +0000 (01:18 -0700)]
llvmpipe: Do not need to free anything if there is no geometry shader.
If gs is null, then freeing state->shader.tokens would result in a null
dereference.
Fixes "Dereference after null check" defect reported by Coverity.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
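The shape of the fix is an early return before anything is dereferenced. The sketch below uses a hypothetical `example_gs` structure and destructor rather than llvmpipe's own types.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a geometry shader state object. */
struct example_gs {
   int *tokens;
};

/* Returns 1 if something was freed, 0 if there was no shader at all. */
static int
delete_example_gs(struct example_gs *gs)
{
   /* No geometry shader: nothing to free, and gs->tokens must not be read. */
   if (!gs)
      return 0;

   free(gs->tokens);
   free(gs);
   return 1;
}
```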
Vinson Lee [Sun, 4 Aug 2013 07:13:53 +0000 (00:13 -0700)]
nvc0: Initialize ptr for unexpected sample_count on release builds.
Fixes "Uninitialized pointer read" defect reported by Coverity.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Vinson Lee [Tue, 6 Aug 2013 00:33:51 +0000 (17:33 -0700)]
draw: Change slot from unsigned to int.
unfilled_stage::face_slot is of type int.
Fixes "Unsigned compared against 0" defect reported by Coverity.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Vinson Lee [Sat, 3 Aug 2013 06:39:24 +0000 (23:39 -0700)]
postprocess: Check ppq is null before calling pp_free_bos.
pp_free_bos dereferences ppq without a null check.
Fixes "Dereference before null check" defect reported by Coverity.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Zack Rusin [Sat, 3 Aug 2013 06:56:19 +0000 (02:56 -0400)]
draw: add back separate input assembler
The issue is that stream output is run before the pipeline, which
means that unless we decompose the primitives before the SO,
things crash. We could convert the entire stream output
code into a pipeline stage but it will take a bit, so for now
fix the crashes by simply re-adding the old input assembler
which is run before the SO.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Zack Rusin [Fri, 2 Aug 2013 06:25:42 +0000 (02:25 -0400)]
draw: implement proper primitive assembler as a pipeline stage
We used to have a face primitive assembler that we ran after if
the gs was missing but we had adjacency primitives in the pipeline.
Let's convert it to a pipeline stage, which allows us to use it
to inject outputs (primitive id) into the vertices. It's also
a lot cleaner because the decomposition is already handled for us.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Zack Rusin [Fri, 2 Aug 2013 05:50:05 +0000 (01:50 -0400)]
draw: fix front face injection
Inject front face only if the fragment shader uses it and
propagate through all channels because otherwise we'll
need to figure out the exact swizzle that the fs expects and
it's just simpler to make sure all the components within
the front face register are correctly set.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Brian Paul [Fri, 2 Aug 2013 14:00:54 +0000 (08:00 -0600)]
tgsi: remove unneeded File == TGSI_FILE_INPUT test
We're already in an "if (File == TGSI_FILE_INPUT)" block at that point.
Brian Paul [Mon, 5 Aug 2013 14:19:36 +0000 (08:19 -0600)]
tgsi: clean up tgsi_scan_shader() function
Replace "fulldecl->Semantic.Name/Index" with semName/semIndex.
Simplify if/else logic for TGSI_FILE_OUTPUT code.
Remove old comment.
Fix indentation.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Zack Rusin [Sat, 3 Aug 2013 02:08:25 +0000 (22:08 -0400)]
llvmpipe: fix frontface behavior again
Let's make sure the frontface is 1 for front and -1 for back.
Discussed with Roland and Jose.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Vinson Lee [Sun, 4 Aug 2013 06:58:43 +0000 (23:58 -0700)]
r600g/sb: Dump correct value for CND.
Fixes "Copy-paste error" reported by Coverity.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Vadim Girlin <vadimgirlin@gmail.com>
Jordan Justen [Mon, 29 Jul 2013 20:48:26 +0000 (13:48 -0700)]
intel_fbo: remove unused intel_renderbuffer hiz functions
We are now using functions that operate on the renderbuffer
attachment to handle layered rendering.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Jordan Justen [Mon, 29 Jul 2013 20:58:03 +0000 (13:58 -0700)]
i965 clear/draw: set renderbuffer attachment as needing depth resolve
Previously we would mark a renderbuffer as needing a depth resolve.
But, to support layered rendering, we need to look at the attachment
instead, since the attachment knows if layered rendering is being
used.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Jordan Justen [Mon, 29 Jul 2013 20:51:31 +0000 (13:51 -0700)]
i965: add intel_renderbuffer_att_set_needs_depth_resolve
This function is needed to support layered rendering. With
layered rendering, the attachment stores the state of whether
layered rendering is being used.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Jordan Justen [Mon, 29 Jul 2013 20:54:47 +0000 (13:54 -0700)]
i965: add intel_miptree_set_all_slices_need_depth_resolve
This function marks all slices of a renderbuffer at a particular
level as needing a depth resolve.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Jordan Justen [Fri, 19 Apr 2013 08:13:31 +0000 (01:13 -0700)]
i965 gen7: don't set FORCE_ZERO_RTAINDEX for layered rendering
When layered rendering is being used, we should not set
FORCE_ZERO_RTAINDEX in the clip state to allow render target
array values other than zero to be used.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Jordan Justen [Mon, 15 Jul 2013 23:37:15 +0000 (16:37 -0700)]
hsw hiz: Remove x/y offset restriction for hiz
This restriction was related to programming the offset fields
of the depth buffer packet. We are now setting these offsets
to 0, so this restriction should no longer be required.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Jordan Justen [Tue, 9 Jul 2013 22:36:32 +0000 (15:36 -0700)]
gen7 depth surface: program 3DSTATE_DEPTH_BUFFER to top of surface
Previously we would always find the 2D sub-surface of interest,
and then program the surface to this location. Now we always
program the 3DSTATE_DEPTH_BUFFER at the start of the surface.
To select the lod/slice, we utilize the lod & minimum array
element fields.
As part of this change, we must revert
1f112ccf:
Revert "i965/gen7: Align all depth miplevels to 8 in the X direction."
We also must disable brw_workaround_depthstencil_alignment for
gen >= 7. Now the hardware will handle alignment when rendering
to additional slices/LODs.
v2:
* Merge with recent MOCS changes
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Jordan Justen [Fri, 19 Jul 2013 22:44:56 +0000 (15:44 -0700)]
gen7 fbo: make unmatched depth/stencil configs return unsupported
For gen >= 7, we will use the lod/minimum-array-element fields to
support layered rendering. This means that we must restrict
the depth & stencil attachments to match in various more restrictive
ways. (Now the width, height, depth, LOD and layer must match)
The reason width, height, and depth must match is that the hardware
has a single set of width, height, and depth settings (in
3DSTATE_DEPTH_BUFFER) that affect both the depth and stencil buffers.
Since these controls determine the miptree layout, they need to be
set correctly in order for lod and minimum-array-element to work
properly. So the only way rendering can work is if the width,
height, and depth match.
In the future, if this restriction proves to be a problem (say
because some crucial client application relies on rendering to
different levels/layers of stencil and depth buffers), then we can
always work around the restriction by copying depth and/or stencil
data to a temporary buffer prior to rendering (much in the same way
that brw_workaround_depthstencil_alignment() does today for
gen < 7), but hopefully that won't be necessary.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Jordan Justen [Tue, 16 Jul 2013 07:01:05 +0000 (00:01 -0700)]
hsw hiz: Add new size restrictions for miplevels > 0
When performing hiz ops, we must ensure that the region sizes
have an 8 aligned width and 4 aligned height. We can tweak the
size for blorp hiz operations at LOD 0, but for the others we
can't. Therefore, we disable hiz for these miplevels if they
don't meet the size alignment requirements.
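
A minimal sketch of the check described above, not the driver's actual code. The function names are hypothetical, and the assumption that LOD 0 is always acceptable (because its size can be adjusted for blorp hiz operations) is taken from the text:

```c
#include <stdbool.h>

/* Hypothetical helper: true when the region size meets the hiz alignment
 * requirement stated above (width multiple of 8, height multiple of 4). */
static bool
hiz_size_is_aligned(unsigned width, unsigned height)
{
   return (width % 8 == 0) && (height % 4 == 0);
}

/* Hypothetical policy: LOD 0 can be tweaked to fit, so it is always
 * allowed; any other miplevel must already be aligned. */
static bool
level_can_use_hiz(unsigned level, unsigned width, unsigned height)
{
   if (level == 0)
      return true;
   return hiz_size_is_aligned(width, height);
}
```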
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Jordan Justen [Tue, 9 Jul 2013 22:32:42 +0000 (15:32 -0700)]
gen7 blorp depth: calculate base surface width/height
This will be used in 3DSTATE_DEPTH_BUFFER in a later patch.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Jordan Justen [Tue, 9 Jul 2013 22:24:56 +0000 (15:24 -0700)]
gen7 depth surface: calculate minimum array element being rendered
In layered rendering this will be 0. Otherwise it will be the
selected slice.
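
As an illustration only (the helper name and parameters are hypothetical, not the driver's), the rule above amounts to:

```c
#include <stdbool.h>

/* Hypothetical helper: layered rendering starts at element 0; otherwise
 * the minimum array element is the single slice selected for rendering. */
static unsigned
min_array_element(bool layered, unsigned selected_slice)
{
   return layered ? 0u : selected_slice;
}
```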
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Jordan Justen [Tue, 9 Jul 2013 22:19:55 +0000 (15:19 -0700)]
gen7 depth surface: calculate LOD being rendered to
This will be used in 3DSTATE_DEPTH_BUFFER in a later patch.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Jordan Justen [Tue, 9 Jul 2013 22:16:35 +0000 (15:16 -0700)]
gen7 depth surface: calculate depth (array size) for depth surface
This will be used in 3DSTATE_DEPTH_BUFFER in a later patch.
Note: Cube maps are treated as 2D arrays with 6 times as
many array elements as the cube map array would have.
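
A small sketch of the cube map note above, assuming a hypothetical helper that is given the number of cube map array elements (1 for a plain cube map):

```c
#include <stdbool.h>

/* Hypothetical helper: a cube map is treated as a 2D array with six
 * faces per cube, so the depth is six times the cube array length. */
static unsigned
depth_surface_array_size(bool is_cube, unsigned array_elements)
{
   return is_cube ? 6u * array_elements : array_elements;
}
```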
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Jordan Justen [Tue, 9 Jul 2013 21:56:38 +0000 (14:56 -0700)]
gen7 depth surface: calculate more specific surface type
This will be used in 3DSTATE_DEPTH_BUFFER in a later patch.
Note: Cube maps are treated as 2D arrays with 6 times as
many array elements as the cube map array would have.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Jordan Justen [Tue, 9 Jul 2013 21:25:11 +0000 (14:25 -0700)]
i965: init global state first in brw_workaround_depthstencil_alignment
In a future pass this will allow us to exit-early from this
routine to disable it for gen >= 7.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Ilia Mirkin [Thu, 1 Aug 2013 16:50:10 +0000 (12:50 -0400)]
nv50: fix some h264 interlaced decoding on vp2
Some videos specify mb_adaptive_frame_field_flag instead of
field_pic_flag. This implies that the pic height needs to be halved, and
this field needs to be passed to the VP engine.
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Zack Rusin [Fri, 2 Aug 2013 05:53:15 +0000 (01:53 -0400)]
llvmpipe: don't interpolate front face or prim id
The loop was iterating over all the fs inputs and setting them
to perspective interpolation, then after the loop we were
creating extra output slots with the correct interpolation. Instead
of injecting bogus extra outputs, just set the interpolation
on front face and prim id correctly when doing the initial scan
of fs inputs.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Zack Rusin [Fri, 2 Aug 2013 05:45:45 +0000 (01:45 -0400)]
draw: make sure clipping works with injected outputs
Clipping would drop the extra outputs because it always
used the number of standard vertex shader outputs, without
geometry shader or extra outputs. The commit makes sure
that clipping with geometry shaders which have more outputs
than the current vertex shader and with extra outputs correctly
propagates the entire vertex.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Zack Rusin [Wed, 31 Jul 2013 11:34:49 +0000 (07:34 -0400)]
draw: inject frontface info into wireframe outputs
The draw module can decompose primitives into wireframe models, which
is a fancy word for 'lines'. Unfortunately, that decomposition means
that we weren't able to preserve the original front-face info which
could be derived from the original primitives (lines don't have a
'face'). To fix this, allow the draw module to inject a fake face
semantic into the outputs, from which the backends can figure out the
original front-facing info of the primitives.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Zack Rusin [Fri, 2 Aug 2013 05:39:35 +0000 (01:39 -0400)]
draw: stop crashing with extra shader outputs
Draw sometimes injects extra shader outputs (aa points, lines or
front face). Unfortunately, most of the pipeline and llvm code
didn't handle them at all. It only worked if the number of inputs
happened to be greater than or equal to the number of shader outputs
plus the extra injected outputs. In particular, when running
the pipeline which depends on the vertex_id in the vertex_header,
things were completely broken. The patch adjusts the code to
correctly use the total number of shader outputs (the standard
ones plus the injected ones) to make it all stop crashing and
work.
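
A rough sketch of the counting described above. The names are hypothetical, and the assumption that every output occupies four floats is made only for illustration:

```c
#include <stddef.h>

/* Hypothetical helper: the count that sizing code should use is the
 * standard shader outputs plus any outputs injected by draw. */
static unsigned
total_shader_outputs(unsigned standard_outputs, unsigned extra_outputs)
{
   return standard_outputs + extra_outputs;
}

/* Hypothetical helper: bytes of output data per vertex, assuming each
 * output is a vector of four floats. */
static size_t
vertex_output_bytes(unsigned standard_outputs, unsigned extra_outputs)
{
   return (size_t) total_shader_outputs(standard_outputs, extra_outputs)
          * 4 * sizeof(float);
}
```

Sizing from the standard count alone would leave no room for the injected outputs, which matches the breakage the message describes.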
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Zack Rusin [Fri, 2 Aug 2013 05:48:36 +0000 (01:48 -0400)]
draw: use the vertex size
Instead of using the magical 4 use the above computed
vertex size. Doesn't change the behavior, just makes the code
a bit cleaner.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Zack Rusin [Fri, 2 Aug 2013 05:43:43 +0000 (01:43 -0400)]
draw/llvm: add some extra debugging output
When dumping shader outputs it's nice to have the integer
values of the outputs, in particular because some values
are integers.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Zack Rusin [Fri, 2 Aug 2013 05:24:41 +0000 (01:24 -0400)]
tgsi: detect prim id and front face usage in fs
Adding code to detect the usage of prim id and front face
semantics in fragment shaders.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Zack Rusin [Fri, 2 Aug 2013 23:06:46 +0000 (19:06 -0400)]
tgsi: add ucmp to the list of opcodes
We forgot to add ucmp to the list of opcodes, so it was never
generated for ureg.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Zack Rusin [Fri, 2 Aug 2013 19:50:16 +0000 (15:50 -0400)]
llvmpipe: make the front-face behavior match the gallium spec
The spec says that front-face is true if the value is >0 and false
if it's <0. To make sure that we follow the spec, let's just
subtract 0.5 from our value (llvmpipe did 1 for frontface and 0
otherwise), which will get us a positive num for frontface and
negative for backface.
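
The arithmetic above can be sketched as follows. This is scalar C for illustration, with hypothetical function names, and is not the llvmpipe code itself:

```c
#include <stdbool.h>

/* Hypothetical encoder: start from the old 1/0 convention and subtract
 * 0.5, giving +0.5 for front faces and -0.5 for back faces. */
static float
encode_front_face(bool is_front)
{
   return (is_front ? 1.0f : 0.0f) - 0.5f;
}

/* Consumer side, following the rule quoted above: front-facing when the
 * value is greater than zero. */
static bool
decode_front_face(float face)
{
   return face > 0.0f;
}
```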
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Matt Turner [Thu, 1 Aug 2013 21:29:05 +0000 (14:29 -0700)]
Makefile.am: Remove api_exec_es* from EXTRA_FILES.
These files were removed in commits a0102154 and a8ab7e33.
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
Matt Turner [Mon, 11 Mar 2013 21:57:16 +0000 (14:57 -0700)]
mesa: Use MIN3 instead of two MIN2s.
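
For illustration, the simplification has this general shape. The macro names below are hypothetical stand-ins chosen to avoid claiming Mesa's exact definitions:

```c
/* Hypothetical two- and three-argument minimum macros. Arguments are
 * evaluated more than once, so they must be free of side effects. */
#define SKETCH_MIN2(a, b)    ((a) < (b) ? (a) : (b))
#define SKETCH_MIN3(a, b, c) SKETCH_MIN2(SKETCH_MIN2(a, b), c)

/* Before: nested two-argument minimums. */
static int
smallest_before(int x, int y, int z)
{
   return SKETCH_MIN2(SKETCH_MIN2(x, y), z);
}

/* After: a single three-argument minimum with the same result. */
static int
smallest_after(int x, int y, int z)
{
   return SKETCH_MIN3(x, y, z);
}
```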
Matt Turner [Thu, 1 Aug 2013 21:20:23 +0000 (14:20 -0700)]
mesa: Update comments to match newer specs.
Old GL 1.x specs used 'b' but newer specs use 'p'. The line immediately
above the second hunk also uses 'p'.
Kenneth Graunke [Fri, 2 Aug 2013 07:01:41 +0000 (00:01 -0700)]
i965: Initialize the maximum number of GS threads on Haswell.
We'll need proper values for max_gs_threads when we eventually support
geometry shaders. Also, we initialize it for every other platform.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Kenneth Graunke [Fri, 2 Aug 2013 08:28:58 +0000 (01:28 -0700)]
glsl: Disallow interpolation qualifiers on non-input/output variables.
Commit 2548092ad8015 switched the sense of interpolation qualifier
checks in order to permit them on geometry shader in/out variables.
In doing so, it accidentally allowed interpolation qualifiers to be
applied to ordinary variables and function parameters.
Fixes a regression in Piglit's local-smooth-01.frag.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>