Nicolai Hähnle [Thu, 12 Nov 2015 14:09:21 +0000 (15:09 +0100)]
st/mesa: add support for batch driver queries to perfmon
v2 + v3: forgot null-pointer checks (spotted by Samuel Pitoiset)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Nicolai Hähnle [Tue, 10 Nov 2015 16:04:32 +0000 (17:04 +0100)]
gallium/hud: add support for batch queries
v2 + v3: be more defensive about allocations
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Nicolai Hähnle [Tue, 10 Nov 2015 13:06:59 +0000 (14:06 +0100)]
gallium: add the concept of batch queries
Some drivers (in particular radeon[si], but also freedreno judging from
a quick grep) may want to expose performance counters that cannot be
individually enabled or disabled.
Allow such drivers to mark driver-specific queries as requiring a new
type of batch query object that is used to start and stop a list of queries
simultaneously.
v3: adjust recently added nv50 queries
v2: documentation for create_batch_query
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
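The batch query idea from "gallium: add the concept of batch queries" can be pictured with a toy model. This is only a sketch: every `toy_*` name below is hypothetical and none of it is the Gallium interface. It shows a group of counters that cannot be switched individually and are therefore started and stopped through one object.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a hardware counter; not a Gallium type. */
struct toy_counter {
   bool running;
};

/* Hypothetical batch object: a list of counters sharing one switch. */
struct toy_batch_query {
   struct toy_counter *counters;
   size_t num_counters;
};

/* Start every counter in the batch together. */
static void
toy_batch_begin(struct toy_batch_query *batch)
{
   for (size_t i = 0; i < batch->num_counters; i++)
      batch->counters[i].running = true;
}

/* Stop every counter in the batch together. */
static void
toy_batch_end(struct toy_batch_query *batch)
{
   for (size_t i = 0; i < batch->num_counters; i++)
      batch->counters[i].running = false;
}

/* Count how many counters are currently running. */
static size_t
toy_batch_num_running(const struct toy_batch_query *batch)
{
   size_t n = 0;
   for (size_t i = 0; i < batch->num_counters; i++)
      n += batch->counters[i].running ? 1 : 0;
   return n;
}
```

A caller never touches a single counter; it only begins or ends the whole batch.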
Nicolai Hähnle [Thu, 12 Nov 2015 11:30:23 +0000 (12:30 +0100)]
st/mesa: maintain active perfmon counters in an array
It is easy enough to pre-determine the required size, and arrays are
generally better behaved especially when they get large.
v2: make sure init_perf_monitor returns true when no counters are active
(spotted by Samuel Pitoiset)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Nicolai Hähnle [Thu, 12 Nov 2015 11:02:44 +0000 (12:02 +0100)]
st/mesa: use BITSET_FOREACH_SET to loop through active perfmon counters
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
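The two perfmon commits above (active counters kept in an array, and a loop over only the set bits) fit together roughly as in the sketch below. It is not Mesa's BITSET_FOREACH_SET macro; `toy_collect_set_bits` and `TOY_WORD_BITS` are hypothetical names, and the bitset is assumed to be stored as 32-bit words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_WORD_BITS 32u

/* Write the indices of all set bits (ascending) into out[] and return
 * how many were written. out[] must have room for num_bits entries. */
static size_t
toy_collect_set_bits(const uint32_t *words, size_t num_bits, unsigned *out)
{
   size_t count = 0;

   for (size_t w = 0; w * TOY_WORD_BITS < num_bits; w++) {
      uint32_t word = words[w];

      while (word != 0) {
         unsigned bit = 0;

         /* Find the lowest set bit of the remaining word. */
         while (!(word & (UINT32_C(1) << bit)))
            bit++;

         size_t index = w * TOY_WORD_BITS + bit;
         if (index < num_bits)
            out[count++] = (unsigned)index;

         /* Clear the lowest set bit and continue. */
         word &= word - 1;
      }
   }
   return count;
}
```

Because the number of set bits is known before any query is created, the array of active counters can be sized once up front.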
Nicolai Hähnle [Thu, 12 Nov 2015 10:53:22 +0000 (11:53 +0100)]
st/mesa: store mapping from perfmon counter to query type
Previously, when a performance monitor was initialized, an inner loop through
all driver queries with string comparisons for each enabled performance
monitor counter was used. This hurts when a driver exposes lots of queries.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
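The change described in "st/mesa: store mapping from perfmon counter to query type" amounts to doing the name lookup once and caching the result. A minimal sketch, assuming hypothetical `toy_*` structures rather than the real state tracker types:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical driver query description. */
struct toy_driver_query {
   const char *name;
   unsigned query_type;
};

/* Hypothetical perfmon counter that remembers its query type. */
struct toy_counter {
   const char *name;
   unsigned query_type; /* cached at creation, no strcmp later */
};

/* One-time lookup by name. Returns 1 on success, 0 if no driver
 * query has that name. */
static int
toy_counter_init(struct toy_counter *counter, const char *name,
                 const struct toy_driver_query *queries,
                 unsigned num_queries)
{
   for (unsigned i = 0; i < num_queries; i++) {
      if (strcmp(queries[i].name, name) == 0) {
         counter->name = queries[i].name;
         counter->query_type = queries[i].query_type;
         return 1;
      }
   }
   return 0;
}
```

Initializing a monitor afterwards only reads `query_type`, so its cost no longer grows with the number of driver queries.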
Nicolai Hähnle [Fri, 6 Nov 2015 13:19:54 +0000 (14:19 +0100)]
st/mesa: map semantic driver query types to underlying type
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Nicolai Hähnle [Tue, 10 Nov 2015 13:41:52 +0000 (14:41 +0100)]
gallium/hud: remove unused field in query_info
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Nicolai Hähnle [Tue, 10 Nov 2015 12:35:01 +0000 (13:35 +0100)]
gallium: remove pipe_driver_query_group_info field type
This was only used to implement an unnecessarily restrictive interpretation
of the spec of AMD_performance_monitor. The spec says
A performance monitor consists of a number of hardware and software
counters that can be sampled by the GPU and reported back to the
application.
I guess one could take this as a requirement that counters _must_ be sampled
by the GPU, but then why are they called _software_ counters? Besides,
there's not much reason _not_ to expose all counters that are available,
and this simplifies the code.
v3: add a missing change in the nouveau driver (thanks Samuel Pitoiset)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Roland Scheidegger [Fri, 20 Nov 2015 03:49:23 +0000 (04:49 +0100)]
gallivm: use sampler index 0 for texel fetches
texel fetches don't use any samplers. Previously we just set the same
number for both texture and sampler unit (as per "ordinary" gl style
sampling where the numbers are always the same) however this would trigger
some assertions checking that the sampler index isn't over PIPE_MAX_SAMPLERS
limit elsewhere with d3d10, so just set to 0.
(Fixing the assertion instead isn't really an option, the sampler isn't
really used but might still pass an out-of-bound pointer around and even
copy some things from it.)
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Ilia Mirkin [Fri, 20 Nov 2015 00:17:04 +0000 (19:17 -0500)]
freedreno/a4xx: add BPTC support
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
François Tigeot [Tue, 17 Nov 2015 17:54:01 +0000 (18:54 +0100)]
xmlconfig: Add support for DragonFly
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Mauro Rossi [Sat, 7 Nov 2015 00:23:46 +0000 (01:23 +0100)]
android: export the path of glsl nir headers
The change is necessary to avoid build errors in the glsl and i965
modules due to missing glsl_types.h header
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Boyan Ding [Fri, 20 Nov 2015 11:11:19 +0000 (11:11 +0000)]
mesa: re-enable KHR_debug for ES contexts
With the earlier issues resolved we can expose the extension.
Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Boyan Ding [Sun, 8 Nov 2015 09:56:40 +0000 (17:56 +0800)]
main: Don't restrict several KHR_debug enum to desktop GL
In preparation for supporting GL_KHR_debug in OpenGL ES
v2: add a missing hunk in _mesa_IsEnabled (Emil)
Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Thu, 5 Nov 2015 20:22:25 +0000 (20:22 +0000)]
mesa: use the correct string for the ES GL_KHR_debug functions
As defined in the spec
when implemented in an OpenGL ES context, all entry points defined
by this extension must have a "KHR" suffix.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Gregory Hainaut [Sun, 25 Oct 2015 14:01:36 +0000 (15:01 +0100)]
glsl: avoid linker and user varying location to overlap
Current behavior on the interface matching:
layout (location = 0) out0; // Assigned to VARYING_SLOT_VAR0 by user
out1; // Assigned to VARYING_SLOT_VAR0 by the linker
New behavior on the interface matching:
layout (location = 0) out0; // Assigned to VARYING_SLOT_VAR0 by user
out1; // Assigned to VARYING_SLOT_VAR1 by the linker
v4:
* Fix variable name in assert
Signed-off-by: Gregory Hainaut <gregory.hainaut@gmail.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
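The new behavior in "glsl: avoid linker and user varying location to overlap" can be sketched as a first-free-slot search over a mask of locations already taken. This is an illustration only; `toy_assign_slot` and the 32-slot mask are assumptions, not the linker's actual data structures.

```c
#include <assert.h>
#include <stdint.h>

/* Return the lowest slot not yet marked in *used_mask and mark it.
 * Returns -1 when all 32 slots are taken. */
static int
toy_assign_slot(uint32_t *used_mask)
{
   for (int slot = 0; slot < 32; slot++) {
      uint32_t bit = UINT32_C(1) << slot;

      if (!(*used_mask & bit)) {
         *used_mask |= bit;
         return slot;
      }
   }
   return -1;
}
```

If the user's explicit `location = 0` is recorded in the mask first, the next linker-assigned varying lands in slot 1 instead of colliding with slot 0.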
Emil Velikov [Fri, 6 Nov 2015 23:39:01 +0000 (23:39 +0000)]
auxiliary/vl/dri2: coding style fixes
Rewrap long(ish) lines, add space between struct foo and *.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Emil Velikov [Mon, 9 Nov 2015 11:25:59 +0000 (11:25 +0000)]
auxiliary/vl/dri2: hide internal functions
Analogous to previous commit. While we're here prefix all functions
identically -> vl_dri2_foo
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Emil Velikov [Mon, 9 Nov 2015 11:24:35 +0000 (11:24 +0000)]
auxiliary/vl/drm: hide internal functions
As of last commit everyone is using the vl_screen dispatch, thus we can
hide this function from the headers and make it static.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Emil Velikov [Fri, 6 Nov 2015 23:12:13 +0000 (23:12 +0000)]
st/vdpau: use the vl_screen dispatch
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Emil Velikov [Fri, 6 Nov 2015 23:02:14 +0000 (23:02 +0000)]
st/xvmc: use the vl_screen dispatch
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Emil Velikov [Mon, 9 Nov 2015 11:23:37 +0000 (11:23 +0000)]
st/va: use the vl_screen dispatch
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Emil Velikov [Fri, 6 Nov 2015 22:45:38 +0000 (22:45 +0000)]
st/omx: use the vl_screen dispatch
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Emil Velikov [Fri, 6 Nov 2015 22:40:34 +0000 (22:40 +0000)]
auxiliary/vl/dri2: setup the dispatch
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Emil Velikov [Mon, 9 Nov 2015 11:34:48 +0000 (11:34 +0000)]
auxiliary/vl/drm: use a label for the error path
... just like every other place in gallium.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Emil Velikov [Mon, 9 Nov 2015 11:18:14 +0000 (11:18 +0000)]
auxiliary/vl/drm: setup the dispatch
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Emil Velikov [Fri, 6 Nov 2015 22:34:01 +0000 (22:34 +0000)]
auxiliary/vl: add dispatch table
As mentioned previously, it will allow us to use different vl backends in
a generic way from any video state-tracker.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
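The dispatch table added in "auxiliary/vl: add dispatch table" follows the usual function-pointer pattern. The sketch below is hypothetical throughout: `toy_vl_screen` and its single `get_name` hook are invented for illustration and do not mirror the real vl_screen members.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical screen object carrying its own backend hooks. */
struct toy_vl_screen {
   const char *(*get_name)(struct toy_vl_screen *screen);
};

static const char *
toy_dri2_get_name(struct toy_vl_screen *screen)
{
   (void)screen;
   return "dri2";
}

static const char *
toy_drm_get_name(struct toy_vl_screen *screen)
{
   (void)screen;
   return "drm";
}

/* Each backend fills in the table when it creates the screen. */
static struct toy_vl_screen toy_dri2_screen = { toy_dri2_get_name };
static struct toy_vl_screen toy_drm_screen = { toy_drm_get_name };

/* A state tracker only goes through the table and never needs to
 * know which backend is behind it. */
static const char *
toy_state_tracker_describe(struct toy_vl_screen *screen)
{
   return screen->get_name(screen);
}
```

With this shape the backend-specific functions can stay static, which matches the later "hide internal functions" commits.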
Emil Velikov [Mon, 9 Nov 2015 11:17:07 +0000 (11:17 +0000)]
auxiliary/vl: rename vl_screen_create to vl_dri2_screen_create
In preparation for having proper multi-platform/backend handling in VL.
With follow up commits we'll introduce a dispatch within vl_screen
similar to the one in pipe_screen. This way any VL state-tracker can
operate seamlessly, considering the backend/platform is properly setup.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Emil Velikov [Mon, 9 Nov 2015 11:14:56 +0000 (11:14 +0000)]
st/va: trivial cleanup
Drop the temporary variable and fold the two conditionals.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Emil Velikov [Mon, 9 Nov 2015 11:03:01 +0000 (11:03 +0000)]
st/omx: straighten get/put_screen
The current code is busted in a number of ways.
- initially checks for omx_display (rather than omx_screen), which may
or may not be around.
- blindly feeds the empty env variable string to loader_open_device()
- reads the env variable every time get_screen is called
- the latter manifests into memory leaks, and other issues as one sets
the variable between two get_screen calls.
Additionally it cleans up a couple of extra bits
- drops unneeded set/check of omx_display.
- makes the teardown (put_screen) order symmetrical to the setup
(get_screen)
v2: Drop the "is empty string" check (Leo)
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Emil Velikov [Thu, 19 Nov 2015 15:36:03 +0000 (15:36 +0000)]
automake: loader: don't create an empty dri3 helper
Seems that creating an empty one does not fare too well with MacOSX's
ar. Considering that all the users of the helper include it only when
needed, let's reshuffle the makefile.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92985
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Vinson Lee <vlee@freedesktop.org>
Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Emil Velikov [Thu, 19 Nov 2015 15:34:20 +0000 (15:34 +0000)]
automake: loader: honour the XCB_DRI3 cflags
Without this the compilation will fail, as the headers are installed in
a non-default location.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Emil Velikov [Thu, 19 Nov 2015 15:50:50 +0000 (15:50 +0000)]
automake: egl: add symbols test
Should help us catch issues where we expose any extra symbols by
mistake. Just like the ones fixed by the previous commit.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Acked-by: Matt Turner <mattst88@gmail.com>
Emil Velikov [Thu, 19 Nov 2015 15:31:06 +0000 (15:31 +0000)]
automake: loader: rework the CPPFLAGS
Rather than duplicating things, just use the generic AM_CPPFLAGS. This
has the fortunate side-effect of adding VISIBILITY_CFLAGS for the dri3
helper. The latter of which was erroneously exposing some internal
symbols.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reported-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Ian Romanick [Wed, 18 Nov 2015 01:57:08 +0000 (17:57 -0800)]
i965: Enable EXT_shader_samples_identical
On the vec4 backend, textureSamplesIdentical() will always return
false. There are currently no test cases for the vec4 backend, so we
don't have much confidence in any implementation. We also don't think
anyone is likely to miss it.
v2: Handle immediate value for MCS smarter. Rebase on changes to
nir_texop_samples_identical (missing second parameter). Suggested by
Jason.
v3: Add Neil's code to handle 16x MSAA in the FS. Also rebase on top of
f9a9ba5e. Stub out the vec4 implementation.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Neil Roberts <neil@linux.intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> [v2]
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> [v2]
Ian Romanick [Wed, 18 Nov 2015 03:31:39 +0000 (19:31 -0800)]
i965/vec4: Handle nir_tex_src_ms_index more like the scalar
v2: Rebase on top of f9a9ba5e.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Ian Romanick [Wed, 18 Nov 2015 01:09:09 +0000 (17:09 -0800)]
nir: Add nir_texop_samples_identical opcode
This is the NIR analog to GLSL IR ir_samples_identical.
v2: Don't add the second nir_tex_src_ms_index parameter. Suggested by
Ken and Jason.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Ian Romanick [Wed, 18 Nov 2015 00:59:40 +0000 (16:59 -0800)]
glsl: Add textureSamplesIdenticalEXT built-in functions
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Ian Romanick [Wed, 18 Nov 2015 00:54:31 +0000 (16:54 -0800)]
glsl: Add ir_samples_identical opcode
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Ian Romanick [Tue, 17 Nov 2015 23:36:15 +0000 (15:36 -0800)]
glsl: Extension tracking for EXT_shader_samples_identical
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Ian Romanick [Tue, 17 Nov 2015 23:32:10 +0000 (15:32 -0800)]
mesa: Extension tracking for EXT_shader_samples_identical
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Ian Romanick [Tue, 17 Nov 2015 23:26:27 +0000 (15:26 -0800)]
Import current draft of EXT_shader_samples_identical spec
v2: Add Neil to the list of contributors. I meant to do that before,
but Matt reminded me.
v3: Fix typos noticed by Nicolai.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Rob Clark [Thu, 5 Nov 2015 15:23:48 +0000 (10:23 -0500)]
nir: add nir_ssa_for_alu_src()
Using something like:
numer = nir_ssa_for_src(bld, alu->src[0].src,
nir_ssa_alu_instr_src_components(alu, 0));
for alu src's with swizzle, like:
vec1 ssa_10 = intrinsic load_uniform () () (0, 0)
vec2 ssa_11 = intrinsic load_uniform () () (1, 0)
vec2 ssa_2 = udiv ssa_10.xx, ssa_11
ends up turning into something like:
vec1 ssa_10 = intrinsic load_uniform () () (0, 0)
vec2 ssa_11 = intrinsic load_uniform () () (1, 0)
vec2 ssa_13 = imov ssa_10
...
because nir_ssa_for_src() ignores the original nir_alu_src's swizzle.
Instead, for alu instructions, nir_ssa_for_alu_src() should be used to
ensure the original alu src's swizzle doesn't get lost in translation:
vec1 ssa_10 = intrinsic load_uniform () () (0, 0)
vec2 ssa_11 = intrinsic load_uniform () () (1, 0)
vec2 ssa_13 = imov ssa_10.xx
...
v2: check for abs/neg, and re-use existing nir_alu_src
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
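What "keeping the swizzle" means in "nir: add nir_ssa_for_alu_src()" can be shown with plain arrays. This is a sketch under simplifying assumptions: `toy_apply_swizzle` is a hypothetical helper working on integers, not a NIR function.

```c
#include <assert.h>

/* Build an n-component destination by picking, for each component i,
 * the source component named by swizzle[i]. */
static void
toy_apply_swizzle(const int *src, const unsigned char *swizzle,
                  unsigned n, int *dst)
{
   for (unsigned i = 0; i < n; i++)
      dst[i] = src[swizzle[i]];
}
```

A one-component source read as `.xx` corresponds to the swizzle `{0, 0}`: both destination components come from component 0. Dropping the swizzle would leave the second component undefined for a vec1 source.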
Rob Clark [Wed, 4 Nov 2015 21:10:52 +0000 (16:10 -0500)]
nir: fix missing increments of num_inputs/num_outputs
Note: not quite perfect, we should use type_size vfunc (in
compiler_options or nir_shader?) to determine how much we
increment num_inputs/outputs/uniforms. But we don't have
that yet, so let's at least fix things for the existing
users of these passes.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>
Rob Clark [Wed, 4 Nov 2015 15:05:32 +0000 (10:05 -0500)]
nir/print: show # of uniforms/inputs/outputs
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Mon, 26 Oct 2015 17:29:45 +0000 (13:29 -0400)]
nir/print: show shader name/label if set
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Rob Clark [Mon, 19 Oct 2015 15:57:51 +0000 (11:57 -0400)]
nir: add nir_var_all enum
Otherwise, passing -1 gets you:
error: invalid conversion from 'int' to 'nir_variable_mode' [-fpermissive]
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Ilia Mirkin [Thu, 19 Nov 2015 06:37:14 +0000 (01:37 -0500)]
freedreno/a4xx: fix 5_5_5_1 texture sampler format
This fixes teximage-colors, fbo-generatemipmap-formats, and probably
others (in relation to the RGB5 formats, others still fail).
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
Ilia Mirkin [Thu, 19 Nov 2015 05:32:39 +0000 (00:32 -0500)]
freedreno/a4xx: add depth clamp and halfz clip
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ilia Mirkin [Thu, 19 Nov 2015 05:06:46 +0000 (00:06 -0500)]
freedreno/a4xx: allow seamless cubemap filtering to be enabled per-texture
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ilia Mirkin [Thu, 19 Nov 2015 04:54:25 +0000 (23:54 -0500)]
freedreno/a4xx: support lod_bias
The lower layers assume that we support this, and it's been core since
GL 1.4. This fixes a slew of piglit tests, especially around
tex-miplevel-selection.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
Samuel Pitoiset [Thu, 19 Nov 2015 08:51:03 +0000 (09:51 +0100)]
nv50: allow using inline vertex data submit when gl_VertexID is used
The hardware can actually generate the vertex ID when vertices come from
a client-side buffer, as when glDrawElements is used.
This doesn't fix (or break) any piglit tests but it improves the
previous attempt of Ilia (c830d19 "nv50: avoid using inline vertex
data submit when gl_VertexID is used").
The only disadvantage is that this only works on G84+, but we don't really
care about that weird and old NV50 chipset.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Samuel Pitoiset [Thu, 19 Nov 2015 08:51:02 +0000 (09:51 +0100)]
nv50: add NV84_3D macro
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Matt Turner [Mon, 2 Nov 2015 20:25:24 +0000 (12:25 -0800)]
i965: Drop IMM fs_reg/src_reg -> brw_reg conversions.
The previous two commits make this unnecessary.
Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Matt Turner [Mon, 2 Nov 2015 20:12:44 +0000 (12:12 -0800)]
i965/vec4: Replace src_reg(imm) constructors with brw_imm_*().
Cuts 1.5k of .text.
Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Matt Turner [Mon, 2 Nov 2015 19:28:35 +0000 (11:28 -0800)]
i965/fs: Use brw_imm_uw().
W/UW immediates are 16-bits, but those 16-bits must be replicated
in the high 16-bits of the 32-bit field.
Remove the useless W/UW immediate saturating code, since we'll now be
using the appropriate immediate (and W/UW immediates in the IR can now
no longer be larger than 16-bits).
Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
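The replication rule stated in "i965/fs: Use brw_imm_uw()" (a 16-bit W/UW immediate must also appear in the high 16 bits of the 32-bit field) is a one-line bit operation. The helper name `toy_pack_uw_imm` is hypothetical; only the rule itself comes from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Place a 16-bit immediate in both halves of a 32-bit field. */
static uint32_t
toy_pack_uw_imm(uint16_t uw)
{
   return (uint32_t)uw | ((uint32_t)uw << 16);
}
```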
Matt Turner [Mon, 2 Nov 2015 19:26:16 +0000 (11:26 -0800)]
i965/fs: Replace fs_reg(imm) constructors with brw_imm_*().
Cuts 10k of .text, of which only 776 bytes are the fs_reg constructor
implementations themselves.
   text    data     bss     dec     hex filename
5204535  214112   27784 5446431  531b1f i965_dri.so before
5193977  214112   27784 5435873  52f1e1 i965_dri.so after
Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Matt Turner [Mon, 2 Nov 2015 18:29:45 +0000 (10:29 -0800)]
i965: Make brw_imm_vf4() take 8-bit restricted floats.
This partially reverts commit bbf8239f92ecd79431dfa41402e1c85318e7267f.
I didn't like that commit to begin with -- computing things at compile
time is fine -- but for purposes of verifying that the resulting values
are correct, looking up 0x00 and 0x30 in a table is a lot better than
evaluating a recursive function.
Anyway, by making brw_imm_vf4() take the actual 8-bit restricted floats
directly (instead of only integral values that would be converted to
restricted float), we can use this function as a replacement for the
vector float src_reg/fs_reg constructors.
brw_float_to_vf() is not currently an inline function, so it will not be
evaluated at compile time. I'll address that in a follow-up patch.
Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
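For readers unfamiliar with the 8-bit restricted floats mentioned in "i965: Make brw_imm_vf4() take 8-bit restricted floats", here is a decoding sketch. It assumes a layout of 1 sign bit, 3 exponent bits with a bias of 3, and 4 mantissa bits, with 0x00 and 0x80 reserved for positive and negative zero. That layout is an assumption of this sketch, although it agrees with the 0x00 and 0x30 values named in the message. `toy_vf_to_float` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Decode an 8-bit restricted float into an IEEE-754 single. */
static float
toy_vf_to_float(uint8_t vf)
{
   uint32_t bits;
   float f;

   if (vf == 0x00 || vf == 0x80) {
      /* The two zero encodings: only the sign bit carries over. */
      bits = (uint32_t)vf << 24;
   } else {
      /* Move sign to bit 31, exponent and mantissa to the top of the
       * float's exponent/mantissa fields, then rebias 3 -> 127. */
      bits = ((uint32_t)(vf & 0x80) << 24) |
             ((uint32_t)(vf & 0x7f) << (23 - 4));
      bits += (uint32_t)(127 - 3) << 23;
   }

   memcpy(&f, &bits, sizeof(f));
   return f;
}
```

Under these assumptions 0x30 decodes to 1.0 and 0x00 to 0.0, which is what makes a lookup table easy to check by eye.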
Nanley Chery [Wed, 18 Nov 2015 23:01:44 +0000 (15:01 -0800)]
mesa: Add test for sorted extension table
Enable developers to know if the table's alphabetical sorting
is maintained or lost.
v2: Move "*" next to pointer name (Matt)
Include extensions_table.h instead of extensions.h (Ian)
Remove extra " *" in comment (Ian)
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Nanley Chery [Wed, 18 Nov 2015 23:01:43 +0000 (15:01 -0800)]
mesa/extensions: Sort the extension table alphabetically
Make it easier to determine where to add new extensions.
Performed with the vim sort command.
v2: Insert newline after last #define (Matt)
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Ilia Mirkin [Thu, 19 Nov 2015 17:25:53 +0000 (12:25 -0500)]
docs: GL3.1 for a3xx and a4xx
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ryan Houdek [Thu, 5 Nov 2015 17:07:08 +0000 (11:07 -0600)]
mesa: enable EXT_blend_func_extended if the driver supports the ARB version
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ryan Houdek [Thu, 5 Nov 2015 17:05:58 +0000 (11:05 -0600)]
mesa: allow MAX_DUAL_SOURCE_DRAW_BUFFERS to be available to ES
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ryan Houdek [Thu, 5 Nov 2015 17:05:17 +0000 (11:05 -0600)]
mesa: enable usage of blend_func_extended blend factors in GLES2
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ryan Houdek [Thu, 5 Nov 2015 17:03:44 +0000 (11:03 -0600)]
glsl: add a parse check to check for the index layout qualifier
This can only be used if EXT_blend_func_extended is enabled
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ryan Houdek [Fri, 6 Nov 2015 02:44:03 +0000 (20:44 -0600)]
glsl: add GL_EXT_blend_func_extended preprocessor define
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ryan Houdek [Thu, 5 Nov 2015 16:59:32 +0000 (10:59 -0600)]
glsl: add support for EXT_blend_func_extended builtins
gl_MaxDualSourceDrawBuffersEXT - Maximum dual-source draw buffers supported
For ESSL 1.0, it provides two builtins since you can't have user-defined
color output variables:
gl_SecondaryFragColorEXT
gl_SecondaryFragDataEXT[MaxDSDrawBuffers]
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ryan Houdek [Thu, 5 Nov 2015 16:53:40 +0000 (10:53 -0600)]
glsl: add EXT_blend_func_extended parser enables
This adds a state for the maximum dual source draw variables available
and the variable for determining if the extension has been enabled
in the program shaders.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ryan Houdek [Thu, 5 Nov 2015 16:52:35 +0000 (10:52 -0600)]
glapi: add EXT_blend_func_extended XML definitions
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Brian Paul [Wed, 18 Nov 2015 16:25:48 +0000 (09:25 -0700)]
os: check for GALLIUM_PROCESS_NAME to override os_get_process_name()
Useful for debugging and for glretrace.
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
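The override in "os: check for GALLIUM_PROCESS_NAME to override os_get_process_name()" boils down to preferring the environment variable when it is set. A minimal sketch: the `toy_*` functions are hypothetical, and treating an empty value as "not set" is an assumption of this example, not something the commit message states.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Pick the override when present and non-empty, else the detected name. */
static const char *
toy_pick_process_name(const char *override, const char *detected)
{
   if (override != NULL && override[0] != '\0')
      return override;
   return detected;
}

/* Wrapper that consults the environment variable named in the commit. */
static const char *
toy_get_process_name(const char *detected)
{
   return toy_pick_process_name(getenv("GALLIUM_PROCESS_NAME"), detected);
}
```

Keeping the decision in a separate function from the `getenv` call makes it testable without touching the environment.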
Connor Abbott [Fri, 14 Aug 2015 18:58:45 +0000 (11:58 -0700)]
glsl: fix ir_constant::equals() for doubles
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Connor Abbott [Fri, 14 Aug 2015 18:58:07 +0000 (11:58 -0700)]
glsl: fix isinf() for doubles
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Connor Abbott [Tue, 4 Aug 2015 21:04:34 +0000 (14:04 -0700)]
nir: fix constant folding of bfi
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Brian Paul [Thu, 19 Nov 2015 00:08:39 +0000 (17:08 -0700)]
hud: fix Windows build break
Protect signal-related code with PIPE_OS_UNIX test.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Ian Romanick [Wed, 18 Nov 2015 02:35:00 +0000 (18:35 -0800)]
glsl: Fix off-by-one error in array size check assertion
Apparently, this has been a bug since 2010 (c30f6e5d).
Also use ARRAY_SIZE instead of open coding it.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
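The commit "glsl: Fix off-by-one error in array size check assertion" does not show the corrected expression here, so the following is only a generic illustration of the pattern: compute the element count with an ARRAY_SIZE-style macro and require the index to be strictly less than it. `TOY_ARRAY_SIZE`, `toy_names` and `toy_index_valid` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Element count of a true array (not a pointer). */
#define TOY_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

static const char *const toy_names[] = { "a", "b", "c" };

/* Valid indices are 0 .. size-1. Using <= here instead of < would be
 * an off-by-one that accepts one element past the end. */
static bool
toy_index_valid(size_t i)
{
   return i < TOY_ARRAY_SIZE(toy_names);
}
```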
Ian Romanick [Tue, 17 Nov 2015 23:27:59 +0000 (15:27 -0800)]
mesa: Don't expose GL_EXT_shader_integer_mix in GLES 1.x
There are no shaders, so it doesn't even make sense to expose the
extension.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: Nanley Chery <nanley.g.chery@intel.com>
Ian Romanick [Wed, 18 Nov 2015 00:58:02 +0000 (16:58 -0800)]
glsl: Silence unused parameter warnings
builtin_functions.cpp:5289:52: warning: unused parameter 'num_arguments' [-Wunused-parameter]
unsigned num_arguments,
^
builtin_functions.cpp:5290:52: warning: unused parameter 'flags' [-Wunused-parameter]
unsigned flags)
^
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ian Romanick [Wed, 18 Nov 2015 00:25:06 +0000 (16:25 -0800)]
glsl: Silence ignored qualifier warning
I think the intention was to mark the "this" parameter as const, but
const goes on the other end to do that.
In file included from glsl_symbol_table.cpp:26:0:
ast.h:339:35: warning: type qualifiers ignored on function return type [-Wignored-qualifiers]
const bool is_single_dimension()
^
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Kenneth Graunke [Sun, 8 Nov 2015 02:58:59 +0000 (18:58 -0800)]
i965: Allow indirect GS input indexing in the scalar backend.
This allows arbitrary non-constant indices on GS input arrays,
both for the vertex index, and any array offsets beyond that.
All indirects are handled via the pull model. We could potentially
handle indirect addressing of pushed data as well, but it would add
additional code complexity, and we usually have to pull inputs anyway
due to the sheer volume of input data. Plus, marking pushed inputs
as live due to indirect addressing could exacerbate register pressure
problems pretty badly. We'd need to be careful.
v2: Use updated MOV_INDIRECT opcode.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Jimmy Berry [Wed, 4 Nov 2015 05:24:47 +0000 (23:24 -0600)]
gallium/hud: document GALLIUM_HUD_PERIOD in envvars.html.
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Jimmy Berry [Tue, 10 Nov 2015 05:20:37 +0000 (23:20 -0600)]
gallium/hud: control visibility at startup and runtime.
- env GALLIUM_HUD_VISIBLE: control default visibility
- env GALLIUM_HUD_SIGNAL_TOGGLE: toggle visibility via signal
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Jason Ekstrand [Mon, 16 Nov 2015 19:48:05 +0000 (11:48 -0800)]
i965/nir: Add hooks for testing nir_shader_clone
This commit adds code for testing nir_shader_clone by running it after each
and every optimization pass and throwing away the old shader. Testing
nir_shader_clone is hidden behind a new INTEL_CLONE_NIR environment
variable.
Reviewed-by: Rob Clark <robclark@freedesktop.org>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Wed, 11 Nov 2015 16:31:29 +0000 (08:31 -0800)]
nir: Add support for cloning shaders
This commit is heavily based on one by Rob Clark <robdclark@gmail.com> but
reworked to re-use nir_create functions and do less hashing.
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
Kenneth Graunke [Tue, 3 Nov 2015 08:31:22 +0000 (00:31 -0800)]
i965/nir: Validate that NIR passes call nir_metadata_preserve().
Failing to call nir_metadata_preserve() can have nasty consequences:
some pass breaks dominance information, but leaves it marked as valid,
causing some subsequent pass to go haywire and probably crash.
This pass adds a simple validation mechanism to ensure passes handle
this properly. We add a new bogus metadata flag that isn't used for
anything in particular, set it before each pass, and ensure it *isn't*
still set after the pass. nir_metadata_preserve will reset the flag,
so correct passes will work, and bad passes will assert fail.
(I would have made these functions static inline, but nir.h is included
in C++, so we can't bit-or enums without lots of casting...)
Thanks to Dylan Baker for the idea.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
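The validation trick in "i965/nir: Validate that NIR passes call nir_metadata_preserve()" can be modelled in a few lines. Everything named `toy_*` or `TOY_*` below is hypothetical; the sketch only mirrors the described mechanism: set a bogus flag before the pass, let the preserve call mask it away, and check that it is gone afterwards.

```c
#include <assert.h>
#include <stdbool.h>

enum {
   TOY_METADATA_NONE = 0,
   TOY_METADATA_DOMINANCE = 1 << 0,
   TOY_METADATA_NOT_PROPERLY_RESET = 1 << 1, /* bogus check flag */
};

struct toy_impl {
   unsigned valid_metadata;
};

/* Keep only the metadata the pass claims to preserve. The bogus flag
 * is never in that set, so a correct pass always clears it. */
static void
toy_metadata_preserve(struct toy_impl *impl, unsigned preserved)
{
   impl->valid_metadata &= preserved;
}

/* A well-behaved pass reports what it preserved. */
static void
toy_good_pass(struct toy_impl *impl)
{
   toy_metadata_preserve(impl, TOY_METADATA_DOMINANCE);
}

/* A buggy pass forgets to call toy_metadata_preserve(). */
static void
toy_bad_pass(struct toy_impl *impl)
{
   (void)impl;
}

/* Returns true when the pass reset the check flag as required. */
static bool
toy_run_checked(struct toy_impl *impl, void (*pass)(struct toy_impl *))
{
   impl->valid_metadata |= TOY_METADATA_NOT_PROPERLY_RESET;
   pass(impl);
   return !(impl->valid_metadata & TOY_METADATA_NOT_PROPERLY_RESET);
}
```

In the real driver the failed check is an assertion; the sketch returns a boolean so both outcomes can be observed.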
Kenneth Graunke [Tue, 3 Nov 2015 08:31:15 +0000 (00:31 -0800)]
i965/nir: Add OPT() and OPT_V() macros for invoking NIR passes.
OPT() is the normal macro for passes that return booleans, while OPT_V()
is a variant that works for passes that don't properly report progress.
(Such passes should be fixed to return a boolean, eventually.)
These macros take care of calling nir_validate_shader() and setting
progress appropriately. In the future, it would be easy to add shader
dumping similar to INTEL_DEBUG=optimizer by extending the macro.
v2 (Jason Ekstrand):
- Fix an unused variable warning
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
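The OPT()/OPT_V() split described in "i965/nir: Add OPT() and OPT_V() macros for invoking NIR passes" might look roughly like the sketch below. These are not the driver's macros: the `TOY_*` macros, the toy shader type, and both passes are hypothetical, and the validation step is reduced to a counter.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_shader {
   int value;
   unsigned validations;
};

/* Stand-in for shader validation; here it only counts calls. */
static void
toy_validate(struct toy_shader *s)
{
   s->validations++;
}

/* A pass that reports progress: halve a non-zero even value. */
static bool
toy_pass_halve_even(struct toy_shader *s)
{
   if (s->value != 0 && s->value % 2 == 0) {
      s->value /= 2;
      return true;
   }
   return false;
}

/* A pass that does not report progress. */
static void
toy_pass_no_report(struct toy_shader *s)
{
   (void)s;
}

/* Run a boolean pass, record progress, then validate. */
#define TOY_OPT(progress, pass, shader) \
   do {                                 \
      if (pass(shader))                 \
         (progress) = true;             \
      toy_validate(shader);             \
   } while (0)

/* Run a void pass, then validate. */
#define TOY_OPT_V(pass, shader) \
   do {                         \
      pass(shader);             \
      toy_validate(shader);     \
   } while (0)

/* Repeat the pass list until no pass reports progress. */
static bool
toy_optimize(struct toy_shader *s)
{
   bool any = false;
   bool progress;

   do {
      progress = false;
      TOY_OPT(progress, toy_pass_halve_even, s);
      TOY_OPT_V(toy_pass_no_report, s);
      if (progress)
         any = true;
   } while (progress);

   return any;
}
```

Because every invocation goes through a macro, extra steps such as dumping the shader after each pass could be added in one place.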
Rob Clark [Fri, 6 Nov 2015 16:35:21 +0000 (11:35 -0500)]
nir: add array length field
This will simplify things somewhat in clone.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Rob Clark [Fri, 6 Nov 2015 16:35:20 +0000 (11:35 -0500)]
nir: remove nir_variable::max_ifc_array_access
No users.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Rob Clark [Tue, 17 Nov 2015 17:35:09 +0000 (12:35 -0500)]
freedreno/a4xx: add fake RGTC support (required for GL3)
The a4xx bits corresponding to 'freedreno/a3xx: add fake RGTC support
(required for GL3)'
TODO: some more reverse engineering. Maybe we get lucky and the hw
supports some of this directly? For now this will help us enable GL3.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Tue, 17 Nov 2015 16:42:53 +0000 (11:42 -0500)]
freedreno/a4xx: add compressed texture formats
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Tue, 17 Nov 2015 16:42:34 +0000 (11:42 -0500)]
freedreno: update generated headers
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Ilia Mirkin [Sun, 8 Nov 2015 05:28:34 +0000 (00:28 -0500)]
freedreno: expose GLSL 140 and fake MSAA for GL3.0/3.1 support
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Ilia Mirkin [Thu, 17 Sep 2015 05:43:36 +0000 (01:43 -0400)]
freedreno/a3xx: fix texture buffers, enable offsets
The main issue is that the current logic looked into cso->u.tex, which
is the wrong side of the union to look into for texture buffers. While I
was at it, it was easy enough to add the logic to handle offsets
(first_element).
- reduce texture buffer size limit (determined experimentally)
- don't look at first/last levels, instead look at first/last element
- include the first element offset
- set offset alignment to 16 (determined experimentally)
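To illustrate the union issue, here is a simplified sketch. struct
sampler_view below is a hypothetical, cut-down stand-in for the gallium
sampler view, and the helper names are made up for this example; only
the u.tex / u.buf split and first_element come from the description
above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified sampler view: texture views describe a range
 * of mip levels, buffer views describe a range of elements. Both live
 * in the same union, so reading the wrong side yields garbage.
 */
struct sampler_view {
   bool is_buffer;
   union {
      struct { unsigned first_level, last_level; } tex;
      struct { unsigned first_element, last_element; } buf;
   } u;
};

/* Number of levels (texture) or elements (buffer) covered by the view,
 * picking the union member that matches the resource type.
 */
static unsigned view_count(const struct sampler_view *v)
{
   if (v->is_buffer)
      return v->u.buf.last_element - v->u.buf.first_element + 1;
   return v->u.tex.last_level - v->u.tex.first_level + 1;
}

/* Byte offset of the first element of a buffer view. */
static unsigned buffer_view_offset(const struct sampler_view *v,
                                   unsigned bytes_per_element)
{
   assert(v->is_buffer);
   return v->u.buf.first_element * bytes_per_element;
}
```

For a buffer view covering elements 4..19 of a 16-byte format, this
gives 16 elements starting at byte offset 64.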
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Ilia Mirkin [Sun, 8 Nov 2015 04:20:31 +0000 (23:20 -0500)]
freedreno: add support for conditional rendering, required for GL3.0
A smarter implementation would make it possible to attach this to emit
state for the BY_REGION versions to avoid breaking the tiling. But this
is a start.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Ilia Mirkin [Sun, 8 Nov 2015 03:13:16 +0000 (22:13 -0500)]
freedreno/a3xx: add fake RGTC support (required for GL3)
Also throw in LATC while we're at it (same exact format). This could be
made more efficient by keeping a shadow compressed texture to use for
returning at map time. However... it's not worth it for now...
presumably compressed textures are not updated often.
Lastly fix up Z32S8 transfers to non-0 layers.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Ilia Mirkin [Sun, 8 Nov 2015 00:32:32 +0000 (19:32 -0500)]
freedreno/a3xx: add missing formats to enable ARB_vertex_type_2_10_10_10_rev
The previously RE'd formats were from an ES driver implementing
OES_vertex_type_10_10_10_2 and thus backwards. A future change could add
the 2_10_10_10 support.
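For reference, the two packings differ in which end of the 32-bit word
holds the first component. The sketch below assumes the usual GL
layouts: x in the low 10 bits and w in the top 2 bits for
2_10_10_10_REV, and x in the top 10 bits and w in the low 2 bits for
10_10_10_2. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* 2_10_10_10_REV: x in bits 0-9, y in 10-19, z in 20-29, w in 30-31. */
static uint32_t pack_2_10_10_10_rev(uint32_t x, uint32_t y,
                                    uint32_t z, uint32_t w)
{
   assert(x < 1024 && y < 1024 && z < 1024 && w < 4);
   return x | (y << 10) | (z << 20) | (w << 30);
}

/* 10_10_10_2: x in bits 22-31, y in 12-21, z in 2-11, w in 0-1. */
static uint32_t pack_10_10_10_2(uint32_t x, uint32_t y,
                                uint32_t z, uint32_t w)
{
   assert(x < 1024 && y < 1024 && z < 1024 && w < 4);
   return (x << 22) | (y << 12) | (z << 2) | w;
}

/* Extract x under each layout. */
static uint32_t unpack_x_2_10_10_10_rev(uint32_t v)
{
   return v & 0x3ff;
}

static uint32_t unpack_x_10_10_10_2(uint32_t v)
{
   return v >> 22;
}
```

Decoding a word with the wrong layout returns an unrelated value for x,
which is the kind of mismatch the message describes as "backwards".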
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Mon, 16 Nov 2015 20:07:29 +0000 (15:07 -0500)]
freedreno/a3xx+a4xx: fix for stk binning pass hang
We'd end up in a state where the shader uses no inputs, yet num_elements
is greater than zero. Triggered by a TF vertex shader which did:
gl_Position = vec4(0.0, 0.0, 0.0, 0.0);
resulting in a binning pass variant with no inputs.
Includes equiv fix in a4xx, even though we don't have binning-pass
enabled yet on a4xx.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Mon, 16 Nov 2015 19:58:50 +0000 (14:58 -0500)]
freedreno/a3xx+a4xx: fix GL_POINTS lockup w/ GLES
point_size_per_vertex is always TRUE for GLES, causing us to configure
the hw as if gl_PointSize was written, even if it was not. Which makes
for grumpy hw.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Ilia Mirkin [Sun, 8 Nov 2015 18:43:07 +0000 (13:43 -0500)]
nir: fix typo in idiv lowering, causing large-udiv-udiv failures
In nv50, and in the Python script that Rob circulated, we do:
bld.mkCmp(OP_SET, CC_GE, TYPE_U32, (s = bld.getSSA()), TYPE_U32, m, b);
Do the same in the nir div lowering pass. This fixes the large-udiv-udiv
piglit tests on freedreno.
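The quoted nv50 line sets s from an unsigned "m >= b" comparison. The
message does not say what the typo was, so the sketch below only
illustrates that kind of step under an assumption: a quotient estimate
that may be one too small is corrected by comparing the implied
remainder against the divisor as unsigned values. udiv_fixup() and its
parameters are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Correct a quotient estimate q_est that is either exact or one too
 * small. m is the remainder implied by the estimate; if it is still at
 * least b (compared as unsigned), one more b fits into a.
 */
static uint32_t udiv_fixup(uint32_t a, uint32_t b, uint32_t q_est)
{
   assert(b != 0);
   uint32_t m = a - q_est * b;
   uint32_t s = (m >= b) ? 1u : 0u; /* unsigned compare */
   return q_est + s;
}
```

The comparison has to be unsigned for large operands: with
a = 0x80000000, b = 0x7fffffff and q_est = 0, a signed compare would
treat m as negative and skip the correction.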
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Oded Gabbay [Tue, 17 Nov 2015 14:16:46 +0000 (16:16 +0200)]
llvmpipe: disable VSX in ppc due to LLVM PPC bug
This patch disables the use of VSX instructions, as they cause some
piglit tests to fail.
For more details, see: https://llvm.org/bugs/show_bug.cgi?id=25503#c7
With this patch, ppc64le reaches parity with x86-64 as far as the piglit
test suite is concerned.
v2:
- Added check that we have at least LLVM 3.4
- Added the LLVM bug URL as a comment in the code
v3:
- Only disable VSX if Altivec is supported, because if Altivec support
is missing, then VSX support doesn't exist anyway.
- Change original patch description.
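The conditions from v2 and v3 combine into a small decision, sketched
here with hypothetical names. should_disable_vsx() is not a llvmpipe
function, and how the result is passed on to LLVM is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Decide whether VSX should be explicitly disabled:
 *  - only when LLVM is at least 3.4 (the v2 check), and
 *  - only when Altivec is supported, since without Altivec there is no
 *    VSX to disable (the v3 check).
 */
static bool should_disable_vsx(bool has_altivec,
                               int llvm_major, int llvm_minor)
{
   assert(llvm_major >= 0 && llvm_minor >= 0);

   bool llvm_at_least_3_4 =
      llvm_major > 3 || (llvm_major == 3 && llvm_minor >= 4);

   return has_altivec && llvm_at_least_3_4;
}
```

A caller would check this once when setting up code generation and, if
it returns true, ask LLVM not to emit VSX instructions.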
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>