Ilia Mirkin [Sun, 27 Sep 2015 05:23:38 +0000 (01:23 -0400)]
tgsi: update atomic op docs
Specify that the operation only applies to the x component, not
per-component as previously specified. This is unnecessary for GL and
creates additional complications for images which need to support these
operations as well.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Ilia Mirkin [Sat, 26 Sep 2015 21:35:41 +0000 (17:35 -0400)]
tgsi: add a is_store property
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Ilia Mirkin [Sat, 7 Nov 2015 07:25:20 +0000 (02:25 -0500)]
tgsi: provide a way to encode memory qualifiers for SSBO
Each load/store on most hardware can specify what caching to do. Since
SSBO allows individual variables to also have separate caching modes,
allow loads/stores to have the qualifiers instead of attempting to
encode them in declarations.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Ilia Mirkin [Sat, 19 Sep 2015 22:19:13 +0000 (18:19 -0400)]
ureg: add buffer support to ureg
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Ilia Mirkin [Sat, 20 Sep 2014 06:54:16 +0000 (02:54 -0400)]
tgsi: add ureg support for image decls
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Jose Fonseca [Fri, 8 Jan 2016 14:03:38 +0000 (14:03 +0000)]
glsl: Ensure 64bits shift is used.
I believe that `1u << x`, where x >= 32 yields undefined results
according to the C standard.
Particularly MSVC says `warning C4334: '<<' : result of 32-bit shift
implicitly converted to 64 bits (was 64-bit shift intended?)`.
Reviewed-by: Brian Paul <brianp@vmware.com>
Jose Fonseca [Fri, 8 Jan 2016 13:59:16 +0000 (13:59 +0000)]
mesa/main: Avoid `void function returning a value` warning.
Trivial.
Reviewed-by: Brian Paul <brianp@vmware.com>
Oded Gabbay [Thu, 7 Jan 2016 15:20:47 +0000 (17:20 +0200)]
configure.ac: add --enable-profile
For profiling mesa's code, especially llvmpipe, PROFILE should be
defined. Currently, this define can only be generated if mesa is
built using scons.
This patch makes it possible to generate this define also when building
mesa through automake tools.
v2:
- Change --enable-llvmpipe-profile to --enable-profile
- Add -fno-omit-frame-pointer to CFLAGS and CXXFLAGS when enabling profile
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Marek Olšák [Fri, 8 Jan 2016 01:11:16 +0000 (02:11 +0100)]
nine: allow fragment shader POSITION and FACE to be system values
Reported-by: Axel Davy <axel.davy@ens.fr>
Marek Olšák [Thu, 7 Jan 2016 22:14:55 +0000 (23:14 +0100)]
vl: allow fragment shader POSITION to be a system value
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com
Reviewed-by: Brian Paul <brianp@vmware.com>
Marek Olšák [Thu, 7 Jan 2016 18:48:56 +0000 (19:48 +0100)]
util/pstipple: allow fragment shader POSITION to be a system value
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com
Reviewed-by: Brian Paul <brianp@vmware.com>
Marek Olšák [Sat, 2 Jan 2016 21:45:10 +0000 (22:45 +0100)]
st/mesa: add support for POSITION and FACE system values
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com
Reviewed-by: Brian Paul <brianp@vmware.com>
Marek Olšák [Fri, 8 Jan 2016 00:45:34 +0000 (01:45 +0100)]
tgsi/scan: update for POSITION and FACE sytem values
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com
Reviewed-by: Brian Paul <brianp@vmware.com>
Marek Olšák [Sat, 2 Jan 2016 19:45:00 +0000 (20:45 +0100)]
gallium: add caps for POSITION and FACE system values
v2: document the integer behavior
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com
Reviewed-by: Brian Paul <brianp@vmware.com>
Marek Olšák [Sat, 2 Jan 2016 22:08:27 +0000 (23:08 +0100)]
program: add a helper for rewriting FP position input to sysval
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com
Reviewed-by: Brian Paul <brianp@vmware.com>
Marek Olšák [Sat, 2 Jan 2016 19:16:16 +0000 (20:16 +0100)]
glsl: optionally declare gl_FragCoord & gl_FrontFacing as system values
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com
Reviewed-by: Brian Paul <brianp@vmware.com>
Marek Olšák [Thu, 7 Jan 2016 22:37:53 +0000 (23:37 +0100)]
tgsi/ureg: handle redundant declarations in ureg_DECL_system_value
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Marek Olšák [Thu, 7 Jan 2016 22:25:48 +0000 (23:25 +0100)]
tgsi/ureg: remove index parameter from ureg_DECL_system_value
It can be trivially derived from the number of already declared system
values. This allows ureg users not to worry about which index to choose.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Marek Olšák [Sat, 2 Jan 2016 18:58:26 +0000 (19:58 +0100)]
st/mesa: remove dead code from mesa_to_tgsi
These aren't part of ARB_fragment_program.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Edward O'Callaghan [Thu, 7 Jan 2016 16:44:46 +0000 (03:44 +1100)]
radeon, si: Use TGSI chan name defines in lp_build_emit_fetch() calls
Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Edward O'Callaghan [Thu, 7 Jan 2016 16:44:45 +0000 (03:44 +1100)]
gallium/aux: Use TGSI chan name defines inplace of literals
Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Nicolai Hähnle [Thu, 7 Jan 2016 20:27:52 +0000 (15:27 -0500)]
mesa: check that internalformat of CopyTexImage*D is not 1, 2, 3, 4
The piglit copyteximage check has recently been augmented to test this, but
apparently it hasn't been fixed in Mesa so far.
This language also already appears in the OpenGL 2.1 spec (Ian).
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Jason Ekstrand [Wed, 6 Jan 2016 23:30:39 +0000 (15:30 -0800)]
i965/compiler: Enable more lowering in NIR
We don't need these for GLSL or ARB, but we need them for SPIR-V
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Jason Ekstrand [Wed, 6 Jan 2016 23:30:38 +0000 (15:30 -0800)]
nir/algebraic: Add more lowering
This commit adds lowering options for the following opcodes:
- nir_op_fmod
- nir_op_bitfield_insert
- nir_op_uadd_carry
- nir_op_usub_borrow
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Jason Ekstrand [Wed, 6 Jan 2016 23:30:37 +0000 (15:30 -0800)]
nir/opcodes: Fix up uadd_carry and usub_borrow
Both were defined as returning bool but the gpu_shader5 functions are
defined to return int. Also, we had the parameters for usub borrwo
backwards in the folding expression.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Ilia Mirkin [Sat, 2 Jan 2016 16:38:42 +0000 (11:38 -0500)]
nvc0: add ARB_indirect_parameters support
I chose to make separate macros for this due to the additional
complexity and extra scratch usage.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ilia Mirkin [Thu, 31 Dec 2015 21:17:19 +0000 (16:17 -0500)]
st/mesa: expose ARB_indirect_parameters when the backend driver allows
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Ilia Mirkin [Thu, 31 Dec 2015 21:11:56 +0000 (16:11 -0500)]
mesa: add support for ARB_indirect_parameters draw functions
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Ilia Mirkin [Thu, 31 Dec 2015 20:47:17 +0000 (15:47 -0500)]
mesa: add parameter buffer, used for ARB_indirect_parameters
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Ilia Mirkin [Thu, 31 Dec 2015 20:19:51 +0000 (15:19 -0500)]
glapi: add ARB_indirect_parameters definitions
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Ilia Mirkin [Sat, 2 Jan 2016 05:45:56 +0000 (00:45 -0500)]
nvc0: add support for real ARB_multi_draw_indirect
The draw groups are now split up into groups of 32 if there's a
non-packed stride, or in groups of 400-500 if the draw data is packed.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ilia Mirkin [Sat, 2 Jan 2016 05:06:22 +0000 (00:06 -0500)]
nvc0: adjust indirect draw macros to handle multiple draws at once
These are still invoked one at a time, but the underlying macro can
handle multiple draws.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ilia Mirkin [Thu, 31 Dec 2015 19:11:07 +0000 (14:11 -0500)]
st/mesa: add support for new mesa indirect draw interface
This shifts all indirect draws to go through the new function. If the
driver doesn't have support for multi draws, we break those up and
perform N draws. Otherwise, we pass everything through for just a single
draw call.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Ilia Mirkin [Thu, 31 Dec 2015 18:30:13 +0000 (13:30 -0500)]
gallium: add caps to expose support for multi indirect draws
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Ilia Mirkin [Thu, 31 Dec 2015 18:07:49 +0000 (13:07 -0500)]
gallium: add sufficient draw interface to allow new indirect features
This makes it possible to support indirect multidraws as well as having
the number of such draws to come from a separate GPU resource.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Ilia Mirkin [Wed, 30 Dec 2015 23:10:56 +0000 (18:10 -0500)]
vbo: create a new draw function interface for indirect draws
All indirect draws are passed to the new draw function. By default
there's a fallback implementation which pipes it right back to
draw_prims, but eventually both the fallback and draw_prim's support for
indirect drawing should be removed.
This should allow a backend to properly support ARB_multi_draw_indirect
and ARB_indirect_parameters.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Roland Scheidegger [Sat, 2 Jan 2016 03:59:09 +0000 (04:59 +0100)]
llvmpipe: do 64bit plane calculations in the sse path
The sse path was pretty much disabled for practical purposes because the
largest allowed fb size was 128x128. So, adapt it for 64bit plane calculations.
This is actually not that difficult, though a problem is that we can't do
a signed 32x32->64bit mul, only unsigned, so need to fix that up. Overall,
the code still looks reasonable, though it's not like changes there in
setup really make much of a difference in the end...
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Roland Scheidegger [Sat, 2 Jan 2016 03:58:37 +0000 (04:58 +0100)]
llvmpipe: don't store eo as 64bit int
eo, just like dcdx and dcdy, cannot overflow 32bit.
Store it as unsigned though just in case (it cannot be negative, but
in theory twice as big as dcdx or dcdy so this gives it one more bit).
This doesn't really change anything, albeit it might help minimally on
32bit archs.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Roland Scheidegger [Thu, 31 Dec 2015 02:20:38 +0000 (03:20 +0100)]
llvmpipe: use aligned data for the assembly program in setup
Back in the day (before
24678700edaf5bb9da9be93a1367f1a24cfaa471) the values
were not actually in a struct but even then I can't see why we didn't simply
align the values. Especially since it's trivial to do so.
(Not that it actually matters since the code is pretty much unused for now.)
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Roland Scheidegger [Thu, 7 Jan 2016 18:38:15 +0000 (19:38 +0100)]
draw: initialize prim header flags when clipping lines
Otherwise, clipped lines would have undefined stippling reset bit if line
stippling is enabled.
(Untested, and I just assume copying over the bits from the original line
is actually the right thing to do.)
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Roland Scheidegger [Wed, 6 Jan 2016 22:49:30 +0000 (23:49 +0100)]
draw: fix line stippling with unfilled prims
The unfilled stage was not filling in the prim header, and the line stage
then decided to reset the stipple counter or not based on the uninitialized
data. This causes some failures in conform linestipple test (albeit quite
randomly happening depending on environment).
So fill in the prim header in the unfilled stage - I am not entirely sure
if anybody really needs determinant after that stage, but there's at least
later stages (wide line for instance) which copy over the determinant as well.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Timothy Arceri [Tue, 14 Jul 2015 13:30:27 +0000 (23:30 +1000)]
glsl: replace null check with assert
This was added in
54f583a20 since then error handling has improved.
The test this was added to fix now fails earlier since
01822706ec
Reviewed-by: Matt Turner <mattst88@gmail.com>
Nicolai Hähnle [Wed, 6 Jan 2016 02:51:27 +0000 (21:51 -0500)]
i965: use _mesa_delete_buffer_object
This is more future-proof, plugs the memory leak of Label and properly
destroys the buffer mutex.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Nicolai Hähnle [Wed, 6 Jan 2016 02:51:13 +0000 (21:51 -0500)]
i915: use _mesa_delete_buffer_object
This is more future-proof, plugs the memory leak of Label and properly
destroys the buffer mutex.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Nicolai Hähnle [Wed, 6 Jan 2016 02:49:37 +0000 (21:49 -0500)]
radeon: use _mesa_delete_buffer_object
This is more future-proof, plugs the memory leak of Label and properly
destroys the buffer mutex.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Nicolai Hähnle [Wed, 6 Jan 2016 02:49:11 +0000 (21:49 -0500)]
st/mesa: use _mesa_delete_buffer_object
This is more future-proof than the current code.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Nicolai Hähnle [Wed, 6 Jan 2016 02:47:04 +0000 (21:47 -0500)]
mesa/bufferobj: make _mesa_delete_buffer_object externally accessible
gl_buffer_object has grown more complicated and requires cleanup. Using this
function from drivers will be more future-proof.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Oded Gabbay [Thu, 7 Jan 2016 17:50:12 +0000 (19:50 +0200)]
llvmpipe: use sse2 conv code for altivec
In lp_build_conv() and lp_build_conv_auto(), there is a special case of
conversion when sse2 is present. That code path is suitable without any
changes to altivec, because all the functions that are called in that
code path already support altivec.
This patch increase the FPS in POWER arch across the board
between 10%-25%
I checked ipers, glxgears, glxspheres64, openarena, xonotic and glmark2.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Marek Olšák [Wed, 6 Jan 2016 01:30:13 +0000 (02:30 +0100)]
radeonsi: adjust the parameters of si_shader_dump
The function will be extended to dump all binaries shaders will consist of,
so si_shader* makes sense here.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 3 Jan 2016 16:18:04 +0000 (17:18 +0100)]
radeonsi: move si_shader_dump call out of si_compile_llvm
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 3 Jan 2016 16:05:05 +0000 (17:05 +0100)]
radeonsi: inline si_shader_binary_read
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 3 Jan 2016 16:03:24 +0000 (17:03 +0100)]
radeonsi: move si_shader_dump call out of si_shader_binary_read
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 3 Jan 2016 15:39:24 +0000 (16:39 +0100)]
radeonsi: separate shader dumping code to si_shader_dump and *_dump_stats
Eventually, I'd like to dump stats for several combined binaries, which is
why you don't see a binary parameter in si_shader_dump_stats
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 27 Dec 2015 23:53:29 +0000 (00:53 +0100)]
radeonsi: add si_shader_destroy_binary
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Mon, 28 Dec 2015 00:45:00 +0000 (01:45 +0100)]
radeonsi: don't pass si_shader to si_compile_llvm
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 27 Dec 2015 22:47:00 +0000 (23:47 +0100)]
radeonsi: move si_shader_binary_upload out of si_compile_llvm
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 27 Dec 2015 22:35:08 +0000 (23:35 +0100)]
radeonsi: always keep shader code, rodata, and relocs in memory
We won't compile shaders in draw calls, but we will concatenate shader
binaries according to states in draw calls, so keep the binaries.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Mon, 28 Dec 2015 00:45:00 +0000 (01:45 +0100)]
radeonsi: don't pass si_shader to si_shader_binary_read
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Mon, 28 Dec 2015 00:45:00 +0000 (01:45 +0100)]
radeonsi: don't pass si_shader to si_shader_binary_read_config
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 27 Dec 2015 23:14:05 +0000 (00:14 +0100)]
radeonsi: add struct si_shader_config
There will be 1 config per variant, which will be a union of configs
from {prolog, main, epilog}. For now, just add the structure.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 27 Dec 2015 19:05:19 +0000 (20:05 +0100)]
radeonsi: move NULL exporting into a separate function
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 27 Dec 2015 19:02:41 +0000 (20:02 +0100)]
radeonsi: move MRT color exporting into a separate function
This will be used by a fragment shader epilog.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 27 Dec 2015 18:36:33 +0000 (19:36 +0100)]
radeonsi: use EXP_NULL for pixel shaders without outputs
This never happens currently.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 27 Dec 2015 16:53:44 +0000 (17:53 +0100)]
radeonsi: only use LLVMBuildLoad once when updating color outputs at the end
without LLVMBuildStore.
So:
- do LLVMBuildLoad
- update the values as necessary
- export
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 27 Dec 2015 16:45:52 +0000 (17:45 +0100)]
radeonsi: export "undef" values for undefined PS outputs
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 27 Dec 2015 16:38:37 +0000 (17:38 +0100)]
radeonsi: move MRTZ export into a separate function
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Wed, 23 Dec 2015 17:06:04 +0000 (18:06 +0100)]
radeonsi: simplify setting the DONE bit for PS exports
First find out what the last export is and simply set the DONE bit there.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Wed, 23 Dec 2015 15:43:54 +0000 (16:43 +0100)]
radeonsi: set SPI color formats and CB_SHADER_MASK outside of compilation
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Wed, 23 Dec 2015 15:24:02 +0000 (16:24 +0100)]
radeonsi: write all MRTs only if there is exactly one output
This doesn't fix a known bug, but better safe than sorry.
Also, simplify the expression in si_shader.c.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Wed, 23 Dec 2015 15:02:46 +0000 (16:02 +0100)]
radeonsi: determine SPI_SHADER_Z_FORMAT outside of shader compilation
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Wed, 23 Dec 2015 14:36:05 +0000 (15:36 +0100)]
radeonsi: determine DB_SHADER_CONTROL outside of shader compilation
because the API pixel shader binary will not emulate alpha test one day,
so the KILL_ENABLE bit must be determined elsewhere.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Fri, 1 Jan 2016 18:42:44 +0000 (19:42 +0100)]
tgsi/scan: set which color components are read by a fragment shader
This will be used by radeonsi.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sat, 2 Jan 2016 16:28:19 +0000 (17:28 +0100)]
tgsi/scan: fix tgsi_shader_info::reads_z
This has no users in Mesa.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Wed, 23 Dec 2015 02:01:32 +0000 (03:01 +0100)]
tgsi/scan: set if a fragment shader writes sample mask
This will be used by radeonsi.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Kenneth Graunke [Tue, 5 Jan 2016 13:34:24 +0000 (05:34 -0800)]
glsl: Disallow vectorization of vector_insert/extract.
vector_insert takes a vector, a scalar location, and a scalar value,
and produces a new vector with that component updated. As such, it
can't be vectorized properly.
vector_extract takes a vector and a scalar location, and returns
that scalar component of the vector. Vectorization doesn't really
make any sense.
Treating both as horizontal operations makes sure the vectorizer
won't try to touch these.
Found by inspection.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Roland Scheidegger [Tue, 22 Dec 2015 02:42:33 +0000 (03:42 +0100)]
softpipe: tell draw about the vertex layout we want
This makes it more similar to llvmpipe. It also allows us to let draw emit
code handle things like getting zeros for non-existing vs outputs
automatically. There probably isn't really any overhead either way, there isn't
really any "simply copy everything" code in the emit path it would copy each
attrib individually just the same. Likewise, we still do another mapping step
in softpipe as the layout may still not match exactly (same as in llvmpipe,
should probably nuke the pointless mapping in both drivers).
This fixes the piglit arb_fragment_layer_viewport no_gs/no_write tests.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Roland Scheidegger [Sat, 19 Dec 2015 05:12:27 +0000 (06:12 +0100)]
llvmpipe: use ints not unsigned for slots
They can't actually be 0 (as position is there) but should avoid confusion.
This was supposed to have been done by
af7ba989fb5a39925a0a1261ed281fe7f48a16cf
but I accidentally pushed an older version of the patch in the end...
Also prettify slightly. And make some notes about the confusing and useless
fs input "map".
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Roland Scheidegger [Sat, 19 Dec 2015 02:43:14 +0000 (03:43 +0100)]
draw: nuke the interp parameter from vertex_info
draw emit couldn't care less what the interpolation mode is...
This somehow looked like it would matter, all drivers more or less
dutifully filled that in correctly. But this is only used for emit,
if draw needs to know about interpolation mode (for clipping for instance)
it will get that information from the vs anyway.
softpipe actually used to depend on that interpolation parameter, as it
abused that structure quite a bit but no longer.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Roland Scheidegger [Sat, 19 Dec 2015 02:37:17 +0000 (03:37 +0100)]
softpipe: don't abuse the draw vertex_info struct for something different
softpipe would calculate two "vertex layouts". The second one was however
just used for internal purposes, draw would know nothing about it even though
it looked exactly the same as the other one we tell draw about.
So, store that information separately as this was just confusing.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Roland Scheidegger [Sat, 19 Dec 2015 01:33:25 +0000 (02:33 +0100)]
softpipe: fix mapping of "special" vs outputs
Unlike llvmpipe, softpipe always tells draw to emit the vertices as-is.
The two vertex layouts it calculates are a bit confusing, one which is just
used to tell draw to emit vertices as-is, and the other which has draw written
all over it but draw is completely unaware of and is used only to look up the
correct interpolation info later in setup.
Thus, the slots used are different to what llvmpipe does (I'm going to clean
up the confusing two layout stuff).
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Roland Scheidegger [Fri, 18 Dec 2015 20:44:06 +0000 (21:44 +0100)]
llvmpipe: scratch some special handling of vp_index/layer
It was actually slightly buggy (missing initialization / setup not dependent
on new vs albeit I didn't see issues), but the case of non-existing attributes
is now handled by draw emit code so don't need that anymore.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Roland Scheidegger [Thu, 7 Jan 2016 00:52:39 +0000 (01:52 +0100)]
draw: rework handling of non-existing outputs in emit code
Previously the code would just redirect requests for attributes which
don't exist to use output 0. Rework this to output all zeros instead which
seems more useful - in particular some extensions like
ARB_fragment_layer_viewport require 0 in the fs even if it wasn't output by
previous stages. That way, drivers don't have to special case this depending
if the vs/gs outputs some attribute or not.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Sarah Sharp [Mon, 21 Sep 2015 21:22:53 +0000 (14:22 -0700)]
mesa: Add KBL PCI IDs and platform information.
Add PCI IDs for the Intel Kabylake platforms. The IDs are taken
directly from the Linux kernel patches, which are under review:
http://lists.freedesktop.org/archives/intel-gfx/2015-October/078967.html
http://cgit.freedesktop.org/~vivijim/drm-intel/log/?h=kbl-upstream-v2
The Kabylake PCI IDs taken from the kernel are rearranged to be in order
of GT type, then PCI ID.
Please note that if this patch is backported, the following fixes will
need to be added before this patch:
commit
28ed1e08e8ba98e "i965/skl: Remove early platform support"
commit
c1e38ad37042b0e "i965/skl: Use larger URB size where available."
Thanks to Ben for fixing a bug around setting urb.size, and being
patient with my questions about what the various fields mean.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Suggested-by: Ben Widawsky <benjamin.widawsky@intel.com>
Tested-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (KBL-GT2)
Cc: "11.1" <mesa-stable@lists.freedesktop.org>
Sinclair Yeh [Thu, 10 Dec 2015 22:26:29 +0000 (14:26 -0800)]
svga: Rename SVGA_HINT_FLAG_DRAW_EMITTED
Rename SVGA_HINT_FLAG_DRAW_EMITTED to SVGA_HINT_FLAG_CAN_PRE_FLUSH
because preemptive flush can be unblocked by more commands than
draw.
Reviewed-by: Brian Paul <brianp@vmware.com>
Sinclair Yeh [Wed, 9 Dec 2015 23:05:49 +0000 (15:05 -0800)]
svga: allow preemptive flushing on DMA, update, and readback commands
The existing code effectively turns off preemptive flushing for all
but the regions used for draws. This turns out to be overly
restrictive as some memory regions, e.g. GMR, may never get a draw
when used as a DMA upload staging area, causing problems for apps
that upload a large amount of textures, e.g. Unigine Heaven.
This patch fixes the Unigine Heaven memory allocation error and
has been verified to not cause a regression in the previous extended
retina display issue.
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Charmaine Lee [Mon, 4 Jan 2016 18:36:48 +0000 (10:36 -0800)]
svga: skip vertex attribute instruction with zero usage_mask
In emit_input_declarations(), we are skipping declarations for those
registers that are not being used. But in emit_vertex_attrib_instructions(),
we are still emitting instructions to tweak the vertex attributes even if
they are not being used. This causes an assert in the backend because an
input register is not declared in the shader. This patch fixes the problem
by skipping the instruction if the vertex attribute is not being used.
Changes in this patch is originated from the code snippet from Jose as
suggested in bug
1530161.
Tested with piglit, Heaven, Turbine, glretrace.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Brian Paul [Wed, 6 Jan 2016 22:45:08 +0000 (15:45 -0700)]
st/mesa: minor clean-ups in st_atom.c
Remove useless comment. Reformat code.
Brian Paul [Wed, 6 Jan 2016 18:48:52 +0000 (11:48 -0700)]
st/mesa: replace bitmap size checks with assertion
The _mesa_Bitmap() caller already checks for zero-sized bitmaps.
Brian Paul [Thu, 17 Dec 2015 21:16:24 +0000 (14:16 -0700)]
st/mesa: check texture target in allocate_full_mipmap()
Some kinds of textures never have mipmaps. 3D textures seldom have
mipmaps.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Brian Paul [Thu, 17 Dec 2015 21:06:11 +0000 (14:06 -0700)]
st/mesa: move mipmap allocation check logic into a function
Better readability and easier to extend.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Brian Paul [Wed, 6 Jan 2016 15:38:33 +0000 (08:38 -0700)]
main: s/GLuint/GLbitfield for state bitmasks
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Brian Paul [Wed, 6 Jan 2016 15:38:03 +0000 (08:38 -0700)]
vbo: s/GLuint/GLbitfield/ for state bitmasks
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Brian Paul [Wed, 6 Jan 2016 15:32:02 +0000 (08:32 -0700)]
st/mesa: use GLbitfield in st_state_flags, add comments
Use GLbitfield instead of GLuint to be consistent with other variables.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Brian Paul [Wed, 6 Jan 2016 15:33:36 +0000 (08:33 -0700)]
s/GLuint/GLbitfield/ for st_invalidate_state() parameter
To match dd_function_table::UpdateState().
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Brian Paul [Wed, 6 Jan 2016 01:11:14 +0000 (18:11 -0700)]
st/mesa: be more careful about state validation in st_Bitmap()
If the only dirty state is mesa's _NEW_PROGRAM_CONSTANTS flag, we can
skip state validation before drawing a bitmap since that state doesn't
effect bitmap rendering.
This further increases the performance of the ipers demo on llvmpipe
to about what it was before commit
36c93a6fae27561.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Brian Paul [Wed, 6 Jan 2016 01:28:57 +0000 (18:28 -0700)]
st/mesa: move bitmap cache flushing out of state validation
Just do it where needed (before drawing, clearing, etc).
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Brian Paul [Wed, 6 Jan 2016 00:10:12 +0000 (17:10 -0700)]
st/mesa: check state->mesa in early return check in st_validate_state()
We were checking the dirty->st flags but not the dirty->mesa flags.
When we took the early return, we didn't clear the dirty->mesa flags
so the next time we called st_validate_state() we'd often flush the
glBitmap cache. And since st_validate_state() is called from
st_Bitmap(), it meant we flushed the bitmap cache for every glBitmap()
call.
This change seems to recover most of the performance loss observed
with the ipers demo on llvmpipe since commit commit
36c93a6fae27561.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Brian Paul [Wed, 6 Jan 2016 00:26:29 +0000 (17:26 -0700)]
st/mesa: protect debug printf() with a conditional instead of comment
Brian Paul [Wed, 6 Jan 2016 00:38:00 +0000 (17:38 -0700)]
st/mesa: fix comment indentation in st_flush_bitmap_cache()
Timothy Arceri [Wed, 6 Jan 2016 09:22:46 +0000 (20:22 +1100)]
glsl: fix varying slot allocation for blocks and structs with explicit locations
Previously each member was being counted as using a single slot,
count_attribute_slots() fixes the count for array and struct members.
Also don't assign a negitive to the unsigned expl_location variable.
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>