Francisco Jerez [Sun, 20 Oct 2013 21:11:27 +0000 (14:11 -0700)]
i965/gen7: Expose ARB_shader_atomic_counters.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Francisco Jerez [Wed, 11 Sep 2013 19:14:46 +0000 (12:14 -0700)]
glsl: Linker support for ARB_shader_atomic_counters.
v2: Add comments on the purpose of the auxiliary data structures.
Check for atomic counter overlaps. Use the contains_atomic()
convenience method. Add static assert with the number of expected
shader stages.
v3: Don't resize atomic arrays.
v4: Add comment on the reason why we don't resize atomic counter
arrays. Use 'strcmp(...) == 0' instead of '!strcmp(...)'.
v5 (idr): Don't use STL in the linker.
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Francisco Jerez [Sun, 20 Oct 2013 19:38:07 +0000 (12:38 -0700)]
glsl: Implement parser support for atomic counters.
v2: Mark atomic counters as read-only variables. Move offset overlap
code to the linker. Use the contains_atomic() convenience method.
v3: Use pointer to integer instead of non-const reference. Add
comment so we remember to add a spec quotation from the next GLSL
release once the issue of atomic counter aggregation within
structures is clarified.
v4 (idr): Don't use std::map because it's overkill. Add an assertion
that ctx->Const.MaxAtomicBufferBindings <= MAX_COMBINED_ATOMIC_BUFFERS.
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Kenneth Graunke [Wed, 23 Oct 2013 05:55:33 +0000 (22:55 -0700)]
Revert "i965: Add support for GL_AMD_performance_monitor on Ironlake."
This reverts most of commit
0f2da773070c06b6d20ad264d3abb19c4dfd9761.
(I chose to leave the additions to brw_defines.h.)
My previous Ironlake implementation was somewhat broken: counter data
was global, rather than per-context. This meant that performance
monitors captured data from your compositor, 2D driver, and other 3D
programs.
Originally, I believed that Sandybridge and later had an easy way to
avoid this problem (setting per-context flags in OACONTROL), while
Ironlake did not. So I'd intended to leave it as a known limitation of
performance monitoring support on Ironlake. However, this turned out
not to be true.
Unfortunately, our hardware only has one set of aggregating performance
counters shared between all 3D programs, and their values are not saved
or restored by hardware contexts. Also, at least on Sandybridge and
Ivybridge, the counters lose their values if the GPU goes to sleep.
To work around both of these problems, we have to snapshot the
performance counters at the beginning and end of each batch, similar to
how we handle query objects on platforms that don't support hardware
contexts.
For occlusion queries, this batch bookending approach is fairly simple:
only one occlusion query can be active at a time, and the result is a
single integer. Performance monitors are more complex: an arbitrary
number of monitors can be active at a time, each monitoring some subset
of our ~30 observability counters. Individual monitors can be started
and stopped at any point during the batch. Tracking where each monitor
started/ended relative to batch flushes ends up being a pain. And you
can run out of space in the buffer.
Properly supporting this required some serious rearchitecting of the
code. Rather than writing patches to try and morph a broken system into
a working one (which operates quite differently), I decided it would be
simplest to revert the old code and start fresh. Parts will look
familiar, but other parts are new.
I also decided it would be best to include Sandybridge and Ivybridge
support from the start, since the newer platforms have added complexity
that I wanted to make sure worked. They're also what most people care
about these days.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
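The batch bookending idea described in the revert message above can be illustrated with a minimal sketch. This is not the driver's code: `hw_counter`, `struct monitor`, `batch_begin()` and `batch_end()` are hypothetical names, and a single global integer stands in for the shared hardware counters.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one global hardware counter that every
 * process using the GPU increments. */
static uint64_t hw_counter;

struct monitor {
   uint64_t begin_snapshot; /* counter value when our batch started */
   uint64_t result;         /* sum of deltas from our own batches only */
   int in_batch;            /* set between batch_begin() and batch_end() */
};

/* Snapshot the counter at the start of one of our batches. */
static void batch_begin(struct monitor *m)
{
   assert(!m->in_batch);
   m->begin_snapshot = hw_counter;
   m->in_batch = 1;
}

/* Snapshot again at the end and keep only what happened in between. */
static void batch_end(struct monitor *m)
{
   assert(m->in_batch);
   m->result += hw_counter - m->begin_snapshot;
   m->in_batch = 0;
}
```

Work done by other programs between two of our batches changes `hw_counter` but never reaches `result`, which is the property the global Ironlake implementation lacked.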
Kenneth Graunke [Thu, 7 Nov 2013 22:39:23 +0000 (14:39 -0800)]
glsl: Enable dFdx, dFdy, and fwidth by default in GLSL ES 3.00.
Previously, we only exposed them in desktop GL or with:
#extension GL_OES_standard_derivatives : enable
GLSL ES 3.00 includes these without an extension, so we need to expose
them by default.
Note that the above #extension line results in an error on desktop GL,
so we don't need to worry about this.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Fredrik Höglund [Thu, 7 Nov 2013 20:58:36 +0000 (21:58 +0100)]
docs: Mark off ARB_vertex_type_10f_11f_11f_rev for r600g
...and update relnotes.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Fredrik Höglund [Thu, 7 Nov 2013 20:49:43 +0000 (21:49 +0100)]
r600g: Add support for PIPE_FORMAT_R11G11B10_FLOAT vertex elements
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Fredrik Höglund [Thu, 7 Nov 2013 20:48:34 +0000 (21:48 +0100)]
st/mesa: Add support for ARB_vertex_type_10f_11f_11f_rev
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Brian Paul [Thu, 7 Nov 2013 22:23:34 +0000 (15:23 -0700)]
mesa: fix return statements in varray.c
Return false, not GL_FALSE. Add missing return value.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71359
Brian Paul [Thu, 7 Nov 2013 21:29:50 +0000 (14:29 -0700)]
svga: always return 4 for PIPE_MAX_COLOR_BUFS
Even if the query returns 8, only 4 really work.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Brian Paul [Wed, 6 Nov 2013 00:24:22 +0000 (17:24 -0700)]
svga: return true for the PIPE_CAP_SM3 query
This just tells the state tracker to turn on the GL_ARB_shader_texture_lod
extension. This simply allows the GLSL compiler to emit TXL and TXD
instructions for both vertex and fragment shaders. We already support
these opcodes in the svga driver. Though, the shadow2DGrad() Piglit
tests are failing.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Matt Turner [Mon, 4 Nov 2013 20:06:17 +0000 (12:06 -0800)]
i965: Add an implementation of intel_miptree_map using streaming loads.
Improves performance of RoboHornet's 2D Canvas toDataURL benchmark
[http://www.robohornet.org/#e=canvastodataurl] by approximately 5x
on Baytrail on ChromiumOS.
Elapsed time drops by -81.4861% +/- 1.22619% (n=3 s=14.9105, confidence=95%).
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Matt Turner [Mon, 4 Nov 2013 20:02:24 +0000 (12:02 -0800)]
mesa: Add a streaming load memcpy implementation.
Uses SSE 4.1's MOVNTDQA instruction (streaming load) to read from
uncached memory without polluting the cache.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
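For readers unfamiliar with streaming loads, the following is a rough sketch of a MOVNTDQA-based copy. It is not Mesa's implementation: `copy_streaming()` and `streaming_load_memcpy()` are illustrative names, and the sketch assumes GCC or Clang (for the `target` attribute and `__builtin_cpu_supports`). It falls back to plain `memcpy` when the source is not 16-byte aligned or SSE 4.1 is unavailable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__x86_64__) || defined(__i386__)
#include <smmintrin.h> /* SSE4.1: _mm_stream_load_si128 (MOVNTDQA) */

/* Copy using streaming loads; src must be 16-byte aligned. */
__attribute__((target("sse4.1")))
static void copy_streaming(void *dst, void *src, size_t len)
{
   char *d = dst;
   char *s = src;

   assert(((uintptr_t)s & 15) == 0);
   while (len >= 16) {
      /* Streaming load from the source, unaligned store to the target. */
      __m128i v = _mm_stream_load_si128((__m128i *)s);
      _mm_storeu_si128((__m128i *)d, v);
      d += 16;
      s += 16;
      len -= 16;
   }
   memcpy(d, s, len); /* remaining tail of fewer than 16 bytes */
}
#endif

/* Hypothetical entry point: use streaming loads only when it is safe. */
static void streaming_load_memcpy(void *dst, void *src, size_t len)
{
#if defined(__x86_64__) || defined(__i386__)
   if (((uintptr_t)src & 15) == 0 && __builtin_cpu_supports("sse4.1")) {
      copy_streaming(dst, src, len);
      return;
   }
#endif
   memcpy(dst, src, len);
}
```

On ordinary cached memory the streaming load behaves like a normal load; the benefit described in the commit applies to reads from uncached (write-combined) mappings.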
Chris Forbes [Thu, 7 Nov 2013 20:57:26 +0000 (09:57 +1300)]
docs: Mark off some more things.
These have been supported on i965/Gen7+ for a while, and are listed
in the 10.0 release notes.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Anuj Phogat [Mon, 4 Nov 2013 22:48:51 +0000 (14:48 -0800)]
i965: Fix 'SIMD16 only' dispatch of fragment shader in case of sample shading
This patch makes changes to correctly set up the Dispatch GRF Start
Register in case of 'SIMD16 only' FS dispatch.
This fixes an issue of incorrect rendering on dolphin emulator with
GL_SAMPLE_SHADING enabled.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Chris Forbes [Thu, 7 Nov 2013 20:02:53 +0000 (09:02 +1300)]
docs: update relnotes
Chris Forbes [Thu, 7 Nov 2013 09:46:22 +0000 (22:46 +1300)]
docs: Mark off ARB_vertex_type_10f_11f_11f_rev.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
Chris Forbes [Thu, 7 Nov 2013 08:26:15 +0000 (21:26 +1300)]
i965: Enable ARB_vertex_type_10f_11f_11f_rev on Gen6+.
This theoretically works on earlier hardware as well, but the extension
requires at least GL3.0.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
Chris Forbes [Thu, 7 Nov 2013 10:19:30 +0000 (23:19 +1300)]
i965: add support for UNSIGNED_INT_10F_11F_11F_REV vertex attribs
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
Chris Forbes [Thu, 7 Nov 2013 09:24:06 +0000 (22:24 +1300)]
vbo: add 10_11_11 support to vbo_attrib_tmp
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
Chris Forbes [Thu, 7 Nov 2013 09:05:01 +0000 (22:05 +1300)]
mesa: Add support to _mesa_bytes_per_vertex_attrib for 10_11_11 format.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
Chris Forbes [Thu, 7 Nov 2013 09:02:24 +0000 (22:02 +1300)]
mesa: add varray support for UNSIGNED_INT_10F_11F_11F_REV type
V2: fix interaction with VertexAttribFormat, since that landed after
this was originally written
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
Chris Forbes [Thu, 7 Nov 2013 08:23:17 +0000 (21:23 +1300)]
mesa: Add extension scaffolding for ARB_vertex_type_10f_11f_11f_rev
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
Matthew McClure [Tue, 29 Oct 2013 20:36:41 +0000 (13:36 -0700)]
draw,llvmpipe,util: add depth bias calculation for arb_depth_buffer_float
With this patch, the llvmpipe and draw modules will calculate the depth bias
according to floating point depth buffer semantics described in the
arb_depth_buffer_float specification, when the driver has a z buffer bound
with a format type of UTIL_FORMAT_TYPE_FLOAT.
By default, the driver will use the existing UNORM calculation for depth bias.
A new function, draw_set_zs_format, was added to calculate the Minimum
Resolvable Depth value and floating point depth sense for the draw module.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
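As a rough illustration of the floating point semantics referred to above: the depth buffer float rules define the minimum resolvable difference as r = 2^(e - n), where e is the exponent of the largest z value in the primitive and n is the number of mantissa bits (23), and the bias is m * factor + r * units. The sketch below is not the code of draw_set_zs_format; all function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FLOAT_MANTISSA_BITS 23

/* 2^exp as a float, built from IEEE-754 bits; exp in [-149, 127]. */
static float pow2f_bits(int exp)
{
   uint32_t bits;
   float f;

   assert(exp >= -149 && exp <= 127);
   if (exp >= -126)
      bits = (uint32_t)(exp + 127) << 23; /* normal number */
   else
      bits = 1u << (exp + 149);           /* denormal */
   memcpy(&f, &bits, sizeof f);
   return f;
}

/* r = 2^(e - n), with e taken from the primitive's largest z value. */
static float float_mrd(float max_z)
{
   uint32_t bits;
   int e;

   memcpy(&bits, &max_z, sizeof bits);
   e = (int)((bits >> 23) & 0xff) - 127;
   if (e < -126)
      e = -126;
   if (e > 127)
      e = 127;
   return pow2f_bits(e - FLOAT_MANTISSA_BITS);
}

/* Hypothetical helper: bias = max_slope * factor + r * units. */
static float depth_bias_float(float max_z, float max_slope,
                              float factor, float units)
{
   return max_slope * factor + float_mrd(max_z) * units;
}
```

Unlike the UNORM case, where r is a constant for a given buffer format, r here depends on the depth of the primitive, so it has to be recomputed per primitive.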
Eric Anholt [Tue, 18 Jun 2013 20:52:03 +0000 (13:52 -0700)]
i965: Avoid flushing the batch for every blorp op.
This brings over the batch-wrap-prevention and aperture space checking
code from the normal brw_draw.c path, so that we don't need to flush the
batch every time.
There's a risk here if the intel_emit_post_sync_nonzero_flush() call isn't
high enough up in the state emit sequences -- before, we implicitly had
one at the batch flush before any state was emitted, so Mesa's workaround
emits didn't really matter. Since the SNB fixes by Ken, I didn't see any
regressions after 3 piglit runs.
Improves cairo-gl performance by 13.7733% +/- 1.74876% (n=30/32)
Improves minecraft apitrace performance by 1.03183% +/- 0.482297% (n=90).
Reduces low-resolution GLB 2.7 performance by 1.17553% +/- 0.432263% (n=88)
Reduces Lightsmark performance by 3.70246% +/- 0.322432% (n=126)
No statistically significant performance difference on unigine tropics
(n=10)
No statistically significant performance difference on openarena (n=755)
The two apps that are hurt happen to include stalls on busy buffer
objects, so I think this is an effect of missing out on an opportune
flush.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Matt Turner [Tue, 5 Nov 2013 21:56:14 +0000 (13:56 -0800)]
build: Build gen_matypes and matypes.h from src/mesa.
Reviewed-by: Eric Anholt <eric@anholt.net>
Matt Turner [Tue, 5 Nov 2013 21:53:45 +0000 (13:53 -0800)]
build: Change HAVE_X86_ASM to mean x86 or x86-64 asm.
I want a conditional that says generally "we have x86 assembly" in the
next patch.
Reviewed-by: Eric Anholt <eric@anholt.net>
Matt Turner [Tue, 5 Nov 2013 19:20:12 +0000 (11:20 -0800)]
configure.ac: Test $asm_arch directly.
Reviewed-by: Eric Anholt <eric@anholt.net>
Fredrik Höglund [Tue, 5 Nov 2013 18:35:17 +0000 (19:35 +0100)]
docs: Mark ARB_vertex_attrib_binding as done, update relnotes
Reviewed-by: Eric Anholt <eric@anholt.net>
Fredrik Höglund [Tue, 5 Nov 2013 18:34:16 +0000 (19:34 +0100)]
mesa: Enable ARB_vertex_attrib_binding
Reviewed-by: Eric Anholt <eric@anholt.net>
Fredrik Höglund [Thu, 11 Apr 2013 14:49:44 +0000 (16:49 +0200)]
mesa: Optimize rebinding the same VBO
Check if the new buffer object has the same name as the current
buffer object before looking it up.
Reviewed-by: Eric Anholt <eric@anholt.net>
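The optimization above amounts to a fast path in front of the name lookup. A minimal sketch, assuming a hypothetical `lookup_buffer()` over a small static table (the real lookup is a hash table query, and none of these names come from Mesa):

```c
#include <assert.h>
#include <stddef.h>

struct buffer_object {
   unsigned name;
};

/* Hypothetical name -> object table standing in for the real hash table. */
static struct buffer_object table[4] = { {1}, {2}, {3}, {4} };
static unsigned lookup_count; /* counts how often the slow path runs */

static struct buffer_object *lookup_buffer(unsigned name)
{
   lookup_count++;
   for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
      if (table[i].name == name)
         return &table[i];
   }
   return NULL;
}

struct binding {
   struct buffer_object *obj;
};

static void bind_buffer(struct binding *b, unsigned name)
{
   /* Fast path: same name as the current binding, so skip the lookup. */
   if (b->obj != NULL && b->obj->name == name)
      return;

   b->obj = lookup_buffer(name);
   assert(b->obj != NULL); /* the sketch only handles known names */
}
```

Applications that rebind the same buffer before every attribute setup call hit the fast path on all but the first bind.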
Fredrik Höglund [Thu, 4 Apr 2013 17:55:50 +0000 (19:55 +0200)]
mesa: Handle zero-stride arrays in _mesa_update_array_max_element()
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Fredrik Höglund [Thu, 4 Apr 2013 20:15:13 +0000 (22:15 +0200)]
mesa: Add Get* support for ARB_vertex_attrib_binding
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Fredrik Höglund [Tue, 9 Apr 2013 18:54:25 +0000 (20:54 +0200)]
mesa: Add ARB_vertex_attrib_binding
update_array() and update_array_format() are changed to update the new
attrib and binding states, and the client arrays become derived state.
Reviewed-by: Eric Anholt <eric@anholt.net>
Fredrik Höglund [Tue, 9 Apr 2013 18:44:58 +0000 (20:44 +0200)]
glapi: Add infrastructure for ARB_vertex_attrib_binding
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Fredrik Höglund [Fri, 1 Nov 2013 18:09:58 +0000 (19:09 +0100)]
mesa: Make handle_bind_buffer_gen() non-static
...and rename it to _mesa_bind_buffer_gen().
This is so the function can be called from _mesa_BindVertexBuffer().
This patch also adds a caller parameter so we can report the right
entry point in error messages.
Based on a patch by Eric Anholt.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Fredrik Höglund [Wed, 3 Apr 2013 20:08:47 +0000 (22:08 +0200)]
mesa: Rename gl_array_object::VertexAttrib to _VertexAttrib
This will become derived state as part of the ARB_vertex_attrib_binding
support.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Fredrik Höglund [Wed, 3 Apr 2013 19:47:44 +0000 (21:47 +0200)]
mesa: Split out the format code from update_array()
Split out the code for updating the array format into a new function
called update_array_format(). This function will be called by both
update_array() and the new glVertexAttrib*Format() entry points in
ARB_vertex_attrib_binding.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Fredrik Höglund [Sat, 11 May 2013 17:23:46 +0000 (19:23 +0200)]
mesa: Restore gl_array_object::NewArray
This will be used by the ARB_vertex_attrib_binding implementation.
This reverts commit
db38e9a0e179441f59274f6f2a751912c29872e2.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Kenneth Graunke [Wed, 6 Nov 2013 08:33:14 +0000 (00:33 -0800)]
i965: Use has_surface_tile_offset in depth/stencil alignment workaround.
Currently, has_surface_tile_offset is equivalent to gen == 4 && !is_g4x.
We already use it for related checks in brw_wm_surface_state.c, so it
makes sense to use it here too. It's simpler and more future-proof.
Broadwell also lacks surface tile offsets. With this patch, I won't
need to update any generation checking; I can simply not set the flag.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Fabio Pedretti [Wed, 6 Nov 2013 09:55:28 +0000 (10:55 +0100)]
gallium: fix build on GNU/kFreeBSD
Patch from Debian package
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
Fabio Pedretti [Wed, 6 Nov 2013 09:55:27 +0000 (10:55 +0100)]
configure.ac: fix build on GNU/kFreeBSD
Based on existing patch from Debian package.
Debian bug: http://bugs.debian.org/524690
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
Fabio Pedretti [Tue, 5 Nov 2013 15:51:19 +0000 (16:51 +0100)]
mesa: add arm64 support
Patch from Ubuntu package
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
Fabio Pedretti [Tue, 5 Nov 2013 11:49:56 +0000 (12:49 +0100)]
r600/compute: silence unused var warning
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Paul Berry [Tue, 5 Nov 2013 02:48:17 +0000 (18:48 -0800)]
i965/gen6: Don't allow SIMD16 dispatch in 4x PERPIXEL mode with computed depth.
Hardware docs say we can only use SIMD8 dispatch in this condition.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Matt Turner [Tue, 5 Nov 2013 19:21:53 +0000 (11:21 -0800)]
configure.ac: Drop no-out-of-tree notice.
We do support out of tree builds now.
Tested-by: Colin Walters <walters@verbum.org>
Matt Turner [Mon, 4 Nov 2013 22:52:22 +0000 (14:52 -0800)]
mesa: Build program as part of libmesa.
Matt Turner [Mon, 4 Nov 2013 22:36:53 +0000 (14:36 -0800)]
mesa: Clean up use of top_srcdir/top_builddir.
Matt Turner [Tue, 5 Nov 2013 00:26:29 +0000 (16:26 -0800)]
i965: Use unreachable() to silence a compiler warning.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Matt Turner [Tue, 5 Nov 2013 00:24:35 +0000 (16:24 -0800)]
mesa: Add unreachable() macro.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
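A macro of this kind is typically used after a switch that covers every enum value, where the compiler cannot prove that control never falls through. The sketch below is an assumption about the general shape, not Mesa's definition; it uses the name `sketch_unreachable()` to avoid clashing with the C23 `unreachable()` macro, and the enum is hypothetical.

```c
#include <assert.h>

/* Sketch: trap in debug builds, give the optimizer a hint otherwise. */
#if defined(__GNUC__)
#define sketch_unreachable() \
   do { assert(!"unreachable"); __builtin_unreachable(); } while (0)
#else
#define sketch_unreachable() assert(!"unreachable")
#endif

enum channel { CHAN_X, CHAN_Y, CHAN_Z };

static int channel_index(enum channel c)
{
   switch (c) {
   case CHAN_X: return 0;
   case CHAN_Y: return 1;
   case CHAN_Z: return 2;
   }
   /* Without this, GCC may warn that control reaches the end of a
    * non-void function. */
   sketch_unreachable();
}
```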
Roland Scheidegger [Wed, 6 Nov 2013 14:40:25 +0000 (15:40 +0100)]
gallivm: fix indirect addressing of inputs
We weren't adding the soa offsets when constructing the indices
for the gather functions. That meant that we were always returning
the data in the first element.
(Copied straight from the same fix for temps.)
While here fix up a couple of broken comments in the fetch functions,
plus don't name a straight float type float4 which is just confusing.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Zack Rusin <zackr@vmware.com>
Vincent Lejeune [Mon, 21 Oct 2013 19:05:57 +0000 (21:05 +0200)]
r600/llvm: Fix isampleBuffer on preEG
Vincent Lejeune [Mon, 21 Oct 2013 16:48:21 +0000 (18:48 +0200)]
r600/llvm: Fix texbuf for pre EG gen
Brian Paul [Tue, 5 Nov 2013 23:58:15 +0000 (16:58 -0700)]
mesa: for GLSL_DUMP_ON_ERROR, also dump the info log
Since it's helpful to know why the shader did not compile.
Also, call fflush() for Windows.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Grigori Goronzy [Tue, 5 Nov 2013 23:35:31 +0000 (00:35 +0100)]
st/vdpau: resolve delayed rendering for GL interop v2
Otherwise OutputSurface interop has funny results sometimes.
This fixes interop with the mpv media player.
v2 (chk): add proper locking
Signed-off-by: Christian König <christian.koenig@amd.com>
Chris Forbes [Wed, 6 Nov 2013 06:35:41 +0000 (19:35 +1300)]
docs: Mark off ARB_sample_shading; minor tidyup.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Chris Forbes [Sat, 26 Oct 2013 23:32:03 +0000 (12:32 +1300)]
i965/fs: Gen4-5: Implement alpha test in shader for MRT
V2: Add comment explaining what emit_alpha_test() is for;
fix spurious temp and bogus whitespace.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
Chris Forbes [Sun, 27 Oct 2013 15:18:29 +0000 (04:18 +1300)]
i965/fs: Gen4-5: Setup discard masks for MRT alpha test
The same setup is required here as when the user-provided shader
explicitly uses KIL or discard.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
Chris Forbes [Sat, 26 Oct 2013 23:09:51 +0000 (12:09 +1300)]
i965: Gen4-5: Include alpha func/ref in program key
V2: Better explanation of the rationale for doing this.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
Chris Forbes [Sat, 26 Oct 2013 23:09:51 +0000 (12:09 +1300)]
i965: Gen4-5: Don't enable hardware alpha test with MRT
We have to do this in the shader instead, since these gens lack an
independent RT0 alpha value in their render target write messages.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
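In plain C, the test that the four Gen4-5 commits above move into the fragment shader looks roughly like the sketch below. It only models the semantics (compare the alpha of render target 0 against the reference and discard the fragment for all targets on failure); the enum and function names are hypothetical, and the real code emits shader instructions instead.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of the GL alpha functions. */
enum alpha_func {
   ALPHA_NEVER, ALPHA_LESS, ALPHA_EQUAL, ALPHA_LEQUAL,
   ALPHA_GREATER, ALPHA_NOTEQUAL, ALPHA_GEQUAL, ALPHA_ALWAYS
};

struct color {
   float r, g, b, a;
};

static bool alpha_passes(enum alpha_func func, float alpha, float ref)
{
   switch (func) {
   case ALPHA_NEVER:    return false;
   case ALPHA_LESS:     return alpha < ref;
   case ALPHA_EQUAL:    return alpha == ref;
   case ALPHA_LEQUAL:   return alpha <= ref;
   case ALPHA_GREATER:  return alpha > ref;
   case ALPHA_NOTEQUAL: return alpha != ref;
   case ALPHA_GEQUAL:   return alpha >= ref;
   case ALPHA_ALWAYS:   return true;
   }
   assert(!"invalid alpha func");
   return false;
}

/* Returns true if the fragment survives. Only rt[0].a is tested; a
 * failure discards the write to every render target. */
static bool shader_alpha_test(enum alpha_func func, float ref,
                              const struct color *rt, int num_rts)
{
   assert(num_rts >= 1);
   return alpha_passes(func, rt[0].a, ref);
}
```

Because the comparison is baked into the generated shader, a change of function requires a different compiled program, which is why the alpha func/ref pair becomes part of the program key.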
Kenneth Graunke [Sat, 2 Nov 2013 03:05:27 +0000 (20:05 -0700)]
i965: Combine {brw,gen7}_update_texture_buffer_surface() functions.
Now that brw_update_texture_buffer_surface() uses the virtual
emit_buffer_surface_state() function, it works for Gen7+ too.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Kenneth Graunke [Sat, 2 Nov 2013 00:37:10 +0000 (17:37 -0700)]
i965: Unvirtualize brw_create_constant_surface; delete Gen7+ variant.
Now that brw_create_constant_surface uses a virtual function internally,
it doesn't need to be virtual itself. We can delete the Gen7+ variant
and simplify things.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Kenneth Graunke [Sat, 2 Nov 2013 00:33:42 +0000 (17:33 -0700)]
i965: Use the new emit_buffer_surface_state() vtable entry.
This will allow us to combine the Gen4-6 and Gen7 variants of these
functions.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Kenneth Graunke [Fri, 25 Oct 2013 18:37:06 +0000 (11:37 -0700)]
i965: Virtualize emit_buffer_surface_state().
This entails adding "mocs" and "rw" parameters to the Gen4-5 version.
I made it actually pay attention to the rw flag (even though it is
always false), but mocs is always ignored.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
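The virtualization in the commits above can be pictured as a per-generation function pointer with a uniform signature. This is only a sketch: `struct surface_state`, `struct context`, the `layout` field and the mocs value 1 are invented for illustration, and only the `mocs`/`rw` parameters come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, heavily simplified surface state. */
struct surface_state {
   uint32_t mocs;
   bool rw;
   int layout; /* which generation's layout was written */
};

typedef void (*emit_buffer_surface_state_fn)(struct surface_state *out,
                                             uint32_t mocs, bool rw);

static void gen4_emit(struct surface_state *out, uint32_t mocs, bool rw)
{
   (void)mocs; /* accepted for a uniform signature, but ignored */
   out->mocs = 0;
   out->rw = rw;
   out->layout = 4;
}

static void gen7_emit(struct surface_state *out, uint32_t mocs, bool rw)
{
   out->mocs = mocs;
   out->rw = rw;
   out->layout = 7;
}

struct context {
   int gen;
   emit_buffer_surface_state_fn emit_buffer_surface_state;
};

static void context_init(struct context *ctx, int gen)
{
   assert(gen >= 4);
   ctx->gen = gen;
   ctx->emit_buffer_surface_state = gen >= 7 ? gen7_emit : gen4_emit;
}

/* One shared caller for all generations, with no per-gen variant. */
static void update_texture_buffer_surface(struct context *ctx,
                                          struct surface_state *out)
{
   ctx->emit_buffer_surface_state(out, 1, false);
}
```

Once the callers go through the pointer, the per-generation copies of the calling functions become identical and can be merged, which is what the follow-up commits do.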
Courtney Goeltzenleuchter [Wed, 30 Oct 2013 21:58:30 +0000 (15:58 -0600)]
i965: Fix compiler warning.
fix: intel_screen.c:1320:4: warning: initialization from
incompatible pointer type [enabled by default]
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Sat, 2 Nov 2013 00:43:43 +0000 (17:43 -0700)]
i965: Tell the unit states how many binding table entries we have.
Before the series with
3c9dc2d31b80fc73bffa1f40a91443a53229c8e2 to
dynamically assign our binding table indices, we didn't really track our
binding table count per shader, so we never filled in these fields.
Affects cairo-gl trace runtime by -2.47953% +/- 1.07281% (n=20)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Mon, 4 Nov 2013 23:49:52 +0000 (15:49 -0800)]
i965: Fix context initialization after
2f896627175384fd5
You can't return stack-initialized values and expect anything good to
happen.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Roland Scheidegger [Tue, 5 Nov 2013 18:21:25 +0000 (19:21 +0100)]
gallivm: optimize lp_build_minify for sse
SSE can't handle true vector shifts (with variable shift count),
so llvm is turning them into a mess of extracts, scalar shifts and inserts.
It is however possible to emulate them in lp_build_minify with float muls,
which should be way faster (saves over 20 instructions per 8-wide
lp_build_minify). This wouldn't work for "generic" 32bit shifts though
since we've got only 24bits of mantissa (actually for left shifts it would
work by using sse41 int mul instead of float mul but not for right shifts).
Note that this has very limited scope for now, since this is only used with
per-pixel lod (otherwise we're avoiding the non-constant shift count by doing
per-quad shifts manually), and only 1d textures even then (though the latter
should change).
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
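The float multiply trick described above can be tried in scalar C. The sketch below assumes minify means max(size >> level, 1) and builds 2^-level directly from exponent bits; it is a scalar illustration with hypothetical names, not the LLVM IR that gallivm generates.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Build the float 2^-level directly from its IEEE-754 exponent bits. */
static float pow2_neg(unsigned level)
{
   uint32_t bits;
   float f;

   assert(level <= 126);
   bits = (127u - level) << 23;
   memcpy(&f, &bits, sizeof f);
   return f;
}

/* max(size >> level, 1) computed with a float multiply instead of a
 * variable shift. Exact only while size fits in the 24-bit mantissa. */
static int minify_float(int size, unsigned level)
{
   int r;

   assert(size >= 0 && size < (1 << 24));
   r = (int)((float)size * pow2_neg(level)); /* truncates like >> */
   return r > 1 ? r : 1;
}
```

The 24-bit limit in the assertion is the reason the commit notes that this would not work for generic 32-bit shifts; texture sizes are far below that bound.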
Ian Romanick [Fri, 1 Nov 2013 21:56:53 +0000 (14:56 -0700)]
nouveau: Use _NEW_SCISSOR instead of hooking through dd_function_table
This will enable removing the dd_function_table::Scissor hook in the
near future.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Ian Romanick [Fri, 1 Nov 2013 21:56:28 +0000 (14:56 -0700)]
nouveau: Use _NEW_VIEWPORT instead of hooking through dd_function_table
This will enable removing the dd_function_table::DepthRange hook in the
near future.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Ian Romanick [Fri, 1 Nov 2013 18:40:44 +0000 (11:40 -0700)]
radeon / r200: Don't pass unused parameters to radeon_viewport
The x, y, width, and height parameters aren't used by radeon_viewport,
so don't pass them. This should make future changes to the
dd_function_table::Viewport interface a little easier.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jljusten@gmail.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Courtney Goeltzenleuchter <courtney@lunarg.com>
Ian Romanick [Fri, 1 Nov 2013 18:38:25 +0000 (11:38 -0700)]
i915: Bring sanity to the Viewport function
The i830 and the i915 driver have the same dd_function_table::Viewport
function... it just has two names and lives in two places. Using a
single implementation allows cleaning up the saved_viewport nonsense
too.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jljusten@gmail.com>
Cc: Courtney Goeltzenleuchter <courtney@lunarg.com>
Ian Romanick [Fri, 1 Nov 2013 18:36:47 +0000 (11:36 -0700)]
i965: Eliminate the saved_viewport wrapper
The i965 driver never installed a dd_function_table::Viewport function,
so this wrapper never actually did anything.
No piglit regressions on IVB on DRI2.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jljusten@gmail.com>
Cc: Courtney Goeltzenleuchter <courtney@lunarg.com>
Alexander von Gluck IV [Tue, 5 Nov 2013 01:31:26 +0000 (01:31 +0000)]
mesa: Remove last BEOS checks
* Goodbye BeOS, we hardly knew thee
* As BeOS was gcc2 only, there was little chance
of this being useful.
* Doesn't affect Haiku in any meaningful way
Reviewed-by: Brian Paul <brianp@vmware.com>
José Fonseca [Fri, 25 Oct 2013 11:39:42 +0000 (12:39 +0100)]
util/u_format: take normalized flag into consideration in util_format_is_rgba8_variant
Just happened to notice it was missing while looking at it.
Paul Berry [Thu, 31 Oct 2013 00:01:01 +0000 (17:01 -0700)]
glsl: Don't generate misleading debug names when packing gs inputs.
Previously, when packing geometry shader input varyings like this:
in float foo[3];
in float bar[3];
lower_packed_varyings would declare a packed varying like this:
(declare (shader_in flat) (array ivec4 3) packed:foo[0],bar[0])
That's confusing, since the packed varying actually stores all three
values of foo and all three values of bar.
This patch causes it to generate the more sensible declaration:
(declare (shader_in flat) (array ivec4 3) packed:foo,bar)
Note that there should be no functional change for users of geometry
shaders, since the packed name is only used for generating debug
output. But this should reduce confusion when using INTEL_DEBUG=gs.
Reviewed-by: Eric Anholt <eric@anholt.net>
Vinson Lee [Mon, 4 Nov 2013 04:27:13 +0000 (20:27 -0800)]
gallivm: Remove llvm::DisablePrettyStackTrace for LLVM >= 3.4.
LLVM 3.4 r193971 removed llvm::DisablePrettyStackTrace and made the
pretty stack trace opt-in rather than opt-out.
The default value of DisablePrettyStackTrace has changed to true in LLVM
3.4 and newer.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60929
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Alexander von Gluck IV [Mon, 4 Nov 2013 18:51:41 +0000 (18:51 +0000)]
target/haiku-softpipe: Fix viewport issues
* Call mesa viewport call on window resize
* Add initial postprocessing code
* Pass hgl_context to private statetracker
as it is more useful than GalliumContext
* Use Lock and Unlock functions to standardize
GalliumContext locking
* Create texture resources in texture validation
Acked-by: Brian Paul <brianp@vmware.com>
Brian Paul [Tue, 5 Nov 2013 01:07:37 +0000 (18:07 -0700)]
mesa: remove __alpha__ && CCPML check
Reviewed-by: Matt Turner <mattst88@gmail.com>
Brian Paul [Tue, 5 Nov 2013 00:47:19 +0000 (17:47 -0700)]
mesa: remove OPENSTEP stuff
Reviewed-by: Matt Turner <mattst88@gmail.com>
Brian Paul [Tue, 5 Nov 2013 00:47:19 +0000 (17:47 -0700)]
mesa: remove macintosh preprocessor stuff
IIRC, this is MacOS 9.x stuff.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Brian Paul [Tue, 5 Nov 2013 00:47:19 +0000 (17:47 -0700)]
mesa: remove __QUICKDRAW__ tests
Reviewed-by: Matt Turner <mattst88@gmail.com>
Brian Paul [Tue, 5 Nov 2013 00:47:19 +0000 (17:47 -0700)]
mesa: remove WGLAPI macro
WGLAPI was defined in glheader.h but wasn't used anywhere.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Kenneth Graunke [Fri, 1 Nov 2013 20:29:37 +0000 (13:29 -0700)]
i965: Expose brw_reg_from_fs_reg() to other files.
This will be useful for Broadwell code as well.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Kenneth Graunke [Fri, 1 Nov 2013 23:21:01 +0000 (16:21 -0700)]
i965: Combine gen6_clip_state.c and gen7_clip_state.c.
The changes between Gen6-7 are minimal, and can easily be solved with
an extra generation check. This cuts a lot of duplicated code.
It also helps prevent even more duplication for Broadwell.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Francisco Jerez [Mon, 4 Nov 2013 19:58:10 +0000 (11:58 -0800)]
dri/nouveau: Fix nouveau_init_screen2 breakage.
Fix incorrect init ordering in nouveau_init_screen2 caused by
083f66fdd6451648fe355b64b02b29a6a4389f0d.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71172
Francisco Jerez [Fri, 1 Nov 2013 18:29:13 +0000 (11:29 -0700)]
i965/gen7: Add instruction latency estimates for untyped atomics and reads.
The latency information has been obtained empirically from
measurements taken on Haswell and Ivy Bridge.
Acked-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Francisco Jerez [Wed, 25 Sep 2013 23:31:35 +0000 (16:31 -0700)]
i965/gen7: Handle atomic instructions from the VEC4 back-end.
This can deal with all the 15 32-bit untyped atomic operations the
hardware supports, but only INC and PREDEC are going to be exposed
through the API for now.
v2: Represent atomics as GLSL intrinsics. Add support for variably
indexed atomic counter arrays.
v3: Add comment on why we don't need to assign uniform storage for
atomic counters.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Francisco Jerez [Wed, 25 Sep 2013 23:30:20 +0000 (16:30 -0700)]
i965/gen7: Handle atomic instructions from the FS back-end.
This can deal with all the 15 32-bit untyped atomic operations the
hardware supports, but only INC and PREDEC are going to be exposed
through the API for now.
v2: Represent atomics as GLSL intrinsics. Add support for variably
indexed atomic counter arrays. Fix interaction with fragment
discard.
v3: Add comment on why we don't need to assign uniform storage for
atomic counters.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Francisco Jerez [Sun, 20 Oct 2013 21:02:08 +0000 (14:02 -0700)]
i965: Add a 'has_side_effects' back-end instruction predicate.
This patch fixes the three dead code elimination passes and the
VEC4/FS instruction scheduling passes so they leave instructions with
side effects alone.
At some point it might be interesting to have the instruction
scheduler calculate the exact memory dependencies between atomic ops,
but they're rare enough that it seems unlikely that it will make any
practical difference.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
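To make the role of the predicate concrete, here is a toy dead code elimination pass that consults it. The instruction representation is invented for this sketch (the real back-end IR and its liveness tracking are far richer); only the idea of leaving instructions with side effects alone comes from the commit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical opcodes; only the atomic one writes memory. */
enum opcode { OP_ADD, OP_MOV, OP_UNTYPED_ATOMIC };

struct inst {
   enum opcode op;
   bool result_used; /* is the destination read by a later instruction? */
   bool removed;
};

static bool has_side_effects(const struct inst *i)
{
   return i->op == OP_UNTYPED_ATOMIC;
}

/* Marks dead instructions as removed and returns how many were marked. */
static int dead_code_eliminate(struct inst *prog, int n)
{
   int removed = 0;

   assert(n >= 0);
   for (int k = 0; k < n; k++) {
      if (prog[k].removed)
         continue;
      /* An unused result is not enough: an atomic still changes memory. */
      if (!prog[k].result_used && !has_side_effects(&prog[k])) {
         prog[k].removed = true;
         removed++;
      }
   }
   return removed;
}
```

An atomic increment whose return value is ignored is the typical case: its destination is dead, yet deleting it would silently drop the counter update.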
Francisco Jerez [Mon, 4 Nov 2013 19:26:13 +0000 (11:26 -0800)]
clover: Calculate optimal work group size when it's not specified by the user.
Inspired by a patch sent to the mailing list by Tom Stellard, but
using a different algorithm to calculate the optimal block size that
has been found to be considerably more effective.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Francisco Jerez [Mon, 4 Nov 2013 19:24:10 +0000 (11:24 -0800)]
clover: Constify some command_queue arguments.
Francisco Jerez [Wed, 30 Oct 2013 18:11:06 +0000 (11:11 -0700)]
clover: Workaround compiler bug present in GCC 4.7.0-4.7.2.
Variadic template aliases make these versions of GCC very confused, so
write down the full type spec instead.
Emil Velikov [Fri, 1 Nov 2013 16:44:10 +0000 (16:44 +0000)]
st/xorg: handle updates to DamageUnregister API
xserver 1.14.99.2 simplified the DamageUnregister API, by
dropping the drawable argument.
Follow xf86-video-intel and xf86-video-vmware approach and
handle the new API by checking XORG_VERSION_CURRENT.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71110
Reported-by: Michał Górny <mgorny@gentoo.org>
Reported-by: Vinson Lee <vlee@freedesktop.org>
Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Brian Paul [Mon, 4 Nov 2013 14:33:41 +0000 (07:33 -0700)]
mesa: remove Watcom C support
Reviewed-by: Eric Anholt <eric@anholt.net>
Brian Paul [Mon, 4 Nov 2013 14:26:54 +0000 (07:26 -0700)]
mesa: remove Centerline C support from gl.h
Reviewed-by: Eric Anholt <eric@anholt.net>
Brian Paul [Mon, 4 Nov 2013 14:29:57 +0000 (07:29 -0700)]
mesa: remove BUILD_FOR_SNAP bits
Reviewed-by: Eric Anholt <eric@anholt.net>
Brian Paul [Mon, 4 Nov 2013 14:25:22 +0000 (07:25 -0700)]
mesa: remove SciTech stuff from gl.h
Reviewed-by: Eric Anholt <eric@anholt.net>
Marek Olšák [Sun, 3 Nov 2013 19:27:28 +0000 (20:27 +0100)]
r600g: properly unbind a DSA state being deleted in r600_delete_dsa_state
Tested-by: Christian König <christian.koenig@amd.com>
Marek Olšák [Thu, 31 Oct 2013 14:49:36 +0000 (15:49 +0100)]
docs/GL3: document radeonsi support, minor cleanup
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>