mesa.git
8 years agonvc0: remove useless goto in nvc0_launch_grid()
Samuel Pitoiset [Mon, 11 Jan 2016 23:11:06 +0000 (00:11 +0100)]
nvc0: remove useless goto in nvc0_launch_grid()

Trivial.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years agomesa: Mark Identity as const
Ian Romanick [Thu, 7 Jan 2016 23:10:16 +0000 (15:10 -0800)]
mesa: Mark Identity as const

I was going to send this as review for dce1e1a8, but I missed that
window.  This saves 64 bytes of unshared data and replaces it with 96
bytes of shared text.  My guess is that some of the calls to memcpy get
optimized to something else.

   text    data     bss     dec     hex filename
7847613  220208   27432 8095253  7b8615 i965_dri.so before
7847709  220144   27432 8095285  7b8635 i965_dri.so after
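
As a rough illustration of why the qualifier moves bytes between sections,
here is a minimal sketch. The names identity_matrix and load_identity are
hypothetical, not Mesa's actual identifiers. A const array may be placed in
read-only storage that is shared between processes, whereas a non-const
array typically lives in per-process writable data.

```c
#include <string.h>

/* Hypothetical stand-in for a 4x4 identity matrix.  Because it is const,
 * the toolchain is free to place it in a read-only section. */
static const float identity_matrix[16] = {
    1.0f, 0.0f, 0.0f, 0.0f,
    0.0f, 1.0f, 0.0f, 0.0f,
    0.0f, 0.0f, 1.0f, 0.0f,
    0.0f, 0.0f, 0.0f, 1.0f
};

/* Copy the identity into a caller-provided 16-element matrix. */
void load_identity(float dst[16])
{
    memcpy(dst, identity_matrix, sizeof(identity_matrix));
}
```

Running size(1) on an object built with and without the const qualifier is
one way to see the shift between the text and data columns.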

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: Brian Paul <brianp@vmware.com>
8 years agoconfigure.ac: always define __STDC_CONSTANT_MACROS
Oded Gabbay [Mon, 11 Jan 2016 19:55:15 +0000 (21:55 +0200)]
configure.ac: always define __STDC_CONSTANT_MACROS

The ISO C99 standard (7.18.4) specifies that C++
implementations should define UINT64_C only when
__STDC_CONSTANT_MACROS is defined.

Because we now use UINT64_C in our cpp files (since commit
208bfc493debe0344d0b9cb93975981f14412628), we need to add this define.

This also solves compilation errors with GCC 4.8.x on ppc64le machines.
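
A minimal sketch of the pattern, under the assumption that the macro is
defined before <stdint.h> is first included. The commit adds the define
through the build system; the in-source #define and the function name
high_bit_mask below are for illustration only. In plain C the define is
harmless, since UINT64_C is always available there.

```c
/* Must be visible before the first inclusion of <stdint.h>.  Only older
 * C++ implementations need it; C compilers ignore it. */
#ifndef __STDC_CONSTANT_MACROS
#define __STDC_CONSTANT_MACROS
#endif
#include <stdint.h>

/* Hypothetical helper: a mask with only bit 63 set, spelled with UINT64_C
 * so the constant has a 64-bit type on every platform. */
uint64_t high_bit_mask(void)
{
    return UINT64_C(1) << 63;
}
```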

v2: add this define to SCons build system

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
8 years agoi965: Upload 3DSTATE_BINDING_TABLE_POINTERS_HS when !TCS on Gen9+.
Kenneth Graunke [Sun, 10 Jan 2016 23:01:03 +0000 (15:01 -0800)]
i965: Upload 3DSTATE_BINDING_TABLE_POINTERS_HS when !TCS on Gen9+.

Gen9+ requires us to emit 3DSTATE_BINDING_TABLE_POINTERS_HS for the
hull shader push constants to take effect.  The passthrough TCS uses
push constants for the default tessellation levels.  So, when those
change, we need to re-upload the binding table as well.

Fixes five Piglit tests on Skylake:
- spec/arb_tessellation_shader/vs-tes-vertex
- spec/arb_tessellation_shader/vs-tes-tessinner-tessouter-inputs-quads
- spec/arb_tessellation_shader/vs-tes-tessinner-tessouter-inputs-tris
- spec/arb_tessellation_shader/tes-read-texture
- spec/arb_tessellation_shader/tess_with_geometry

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
8 years agoAdd missing platform information for KBL
Mark Janes [Sat, 9 Jan 2016 00:30:20 +0000 (16:30 -0800)]
Add missing platform information for KBL

In testing KBL, I found:

 - urb size was not set for slices gt1.5, gt2, and gt3.  The value I
   used for these slices (384) was taken from an earlier patch authored
   by Ben Widawsky.

 - slice count was missing.  This field was added by
   a403ad4f5a034e52a3cd845e91c4aa3e6927b731

With this commit, KBL passes piglit at parity with SKL.

Note: As requested by Kristian, Sarah modified this patch to drop
setting urb size for gt1.5, gt2, and gt3, since the correct default is
set in the GEN9 macro by commit c1e38ad37042b0ec261eb0ba5631b7ff0ee7a9da
"i965/skl: Use larger URB size where available."

Signed-off-by: Mark Janes <mark.a.janes@intel.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Reviewed-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
Cc: "11.1" <mesa-stable@lists.freedesktop.org>
8 years agonv50/ir: the whole point of data array is to hand out regular registers
Ilia Mirkin [Mon, 11 Jan 2016 17:58:19 +0000 (12:58 -0500)]
nv50/ir: the whole point of data array is to hand out regular registers

Fixes: 0d3051f75a (nv50/ir: Fix scratch allocation size and file)
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years agomesa/uniform_query: add IROUNDD and use for doubles->ints (v2)
Dave Airlie [Thu, 15 Oct 2015 04:07:40 +0000 (05:07 +0100)]
mesa/uniform_query: add IROUNDD and use for doubles->ints (v2)

For the case where we convert a double to an int, we should
round the same as we do for floats.

This fixes GL41-CTS.gpu_shader_fp64.state_query

v2: add IROUNDD (Ilia)
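
A minimal sketch of what such a rounding helper could look like, assuming
the float path rounds half away from zero. The function name iround_double
is hypothetical and is not claimed to match the IROUNDD macro itself.

```c
/* Hypothetical helper: round a double to the nearest int, with halfway
 * cases going away from zero.  Assumes the result fits in an int. */
int iround_double(double d)
{
    return (int)(d >= 0.0 ? d + 0.5 : d - 0.5);
}
```

A plain (int) cast would truncate toward zero instead, so 2.7 would become
2 rather than 3.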

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dave Airlie <airlied@redhat.com>
8 years agoglsl: replace unreachable code path with assert
Timothy Arceri [Fri, 8 Jan 2016 04:25:37 +0000 (15:25 +1100)]
glsl: replace unreachable code path with assert

The lower_named_interface_blocks() pass is called before we try to
assign locations to varyings, so this shouldn't be reachable.

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
8 years agoRevert "glsl: replace unreachable code path with assert"
Timothy Arceri [Sun, 10 Jan 2016 22:20:39 +0000 (09:20 +1100)]
Revert "glsl: replace unreachable code path with assert"

This reverts commit 98270fd20d4d58db8ae5af3b6f10ed6a81c058a6.

Something went terribly wrong: the commit is not what the commit
message says.

8 years agoglsl: replace unreachable code path with assert
Timothy Arceri [Fri, 8 Jan 2016 04:25:37 +0000 (15:25 +1100)]
glsl: replace unreachable code path with assert

The lower_named_interface_blocks() pass is called before we try to
assign locations to varyings, so this shouldn't be reachable.

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
8 years agoglsl: combine if blocks
Timothy Arceri [Fri, 8 Jan 2016 04:25:36 +0000 (15:25 +1100)]
glsl: combine if blocks

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
8 years agomesa: Update todo regarding StencilOp and StencilOpSeparate.
Rhys Kidd [Thu, 20 Aug 2015 13:03:29 +0000 (23:03 +1000)]
mesa: Update todo regarding StencilOp and StencilOpSeparate.

OpenGL 2.0 function StencilOp() is in part internally implemented via
StencilOpSeparate(). This change happened some time ago, however the
accompanying doxygen todo comment was not accordingly updated.

Replace the outdated portion of this doxygen todo comment, leaving the
remainder unchanged.

Also better respect the 80 character suggested line length in this file.

v2: Fully remove comment, following code review by t_arceri@yahoo.com.au

Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
8 years agoglsl: Make bitfield_insert/extract and bfi/bfm non-vectorizable.
Kenneth Graunke [Thu, 7 Jan 2016 23:45:21 +0000 (15:45 -0800)]
glsl: Make bitfield_insert/extract and bfi/bfm non-vectorizable.

Currently, opt_vectorize() tries to combine:

    result.x = bitfieldInsert(src0.x, src1.x, src2.x, src3.x);
    result.y = bitfieldInsert(src0.y, src1.y, src2.y, src3.y);
    result.z = bitfieldInsert(src0.z, src1.z, src2.z, src3.z);
    result.w = bitfieldInsert(src0.w, src1.w, src2.w, src3.w);

into a single ir_quadop_bitfield_insert opcode, which operates on
ivec4s.  However, GLSL IR's opcodes currently require the bits and
offset parameters to be scalar integers.  So, this breaks.

We want to be able to vectorize this eventually, but for now, just
chicken out and make opt_vectorize() bail by marking all the bitfield
insert/extract related opcodes as horizontal.  This is a relatively
uncommon case today, so we'll do the simple fix for stable branches,
and fix it properly on master.

Fixes assertion failures when compiling Shadow of Mordor vertex shaders
on i965 in vec4 mode (where OptimizeForAOS enables opt_vectorize()).
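
To make the scalar-parameter problem concrete, here is a sketch in C of the
per-component semantics. The function names are hypothetical and this is
not the GLSL IR implementation. Each component may carry its own offset and
bits, which a single vector opcode with scalar offset and bits operands
cannot express.

```c
#include <stdint.h>

/* Scalar bitfield insert: replace `bits` bits of `base`, starting at bit
 * `offset`, with the low bits of `insert`.  Assumes offset + bits <= 32. */
uint32_t bitfield_insert(uint32_t base, uint32_t insert,
                         unsigned offset, unsigned bits)
{
    uint32_t mask;

    if (bits == 0)
        return base;

    /* Avoid shifting a 32-bit value by 32, which is undefined in C. */
    mask = (bits >= 32) ? 0xFFFFFFFFu : ((1u << bits) - 1u);
    mask <<= offset;
    return (base & ~mask) | ((insert << offset) & mask);
}

/* The four statements from the commit message, written as a loop.  Note
 * that offset[] and bits[] differ per component. */
void bitfield_insert_vec4(uint32_t result[4], const uint32_t base[4],
                          const uint32_t insert[4],
                          const unsigned offset[4], const unsigned bits[4])
{
    int i;

    for (i = 0; i < 4; i++)
        result[i] = bitfield_insert(base[i], insert[i], offset[i], bits[i]);
}
```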

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
8 years agonv50/ir: Fix scratch allocation size and file
Pierre Moreau [Fri, 1 Jan 2016 12:09:42 +0000 (13:09 +0100)]
nv50/ir: Fix scratch allocation size and file

Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years agomesa: merge bind_atomic_buffers_{base|range}
Nicolai Hähnle [Wed, 6 Jan 2016 22:30:18 +0000 (17:30 -0500)]
mesa: merge bind_atomic_buffers_{base|range}

Reduced code duplication should make the code more maintainable.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
8 years agomesa: merge bind_shader_storage_buffers_{base|range}
Nicolai Hähnle [Wed, 6 Jan 2016 22:26:14 +0000 (17:26 -0500)]
mesa: merge bind_shader_storage_buffers_{base|range}

Reduced code duplication should make the code more maintainable.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
8 years agomesa: merge bind_uniform_buffers_{base|range}
Nicolai Hähnle [Wed, 6 Jan 2016 22:20:57 +0000 (17:20 -0500)]
mesa: merge bind_uniform_buffers_{base|range}

Reduced code duplication should make the code more maintainable.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
8 years agomesa: merge bind_xfb_buffers_{base|range}
Nicolai Hähnle [Wed, 6 Jan 2016 20:47:01 +0000 (15:47 -0500)]
mesa: merge bind_xfb_buffers_{base|range}

Reduced code duplication should make the code more maintainable.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
8 years agoglsl: Don't add nir files to libglsl_la_SOURCES
Kristian Høgsberg Kristensen [Fri, 8 Jan 2016 23:23:56 +0000 (15:23 -0800)]
glsl: Don't add nir files to libglsl_la_SOURCES

SCons doesn't understand nir yet and doesn't want to compile the glsl to
nir pass. Move the files to their own variable so we can add them only for
automake.

Tested-by: Brian Paul <brianp@vmware.com>
8 years agonv50,nvc0: use a face sysval to avoid the useless back-and-forth conversion
Ilia Mirkin [Fri, 8 Jan 2016 22:32:56 +0000 (17:32 -0500)]
nv50,nvc0: use a face sysval to avoid the useless back-and-forth conversion

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years agoglsl: Move _mesa_shader_stage_to_string/abbrev to shader_enums.c
Kristian Høgsberg Kristensen [Fri, 8 Jan 2016 20:35:48 +0000 (12:35 -0800)]
glsl: Move _mesa_shader_stage_to_string/abbrev to shader_enums.c

These are used by code that doesn't necessarily link to libglsl.la. Move
them to shader_enums.[ch] where we keep similar helpers.

Reviewed-by: Matt Turner <mattst88@gmail.com>
8 years agoi965: Move GLSL lowering passes out of libi965_compiler.la
Kristian Høgsberg Kristensen [Fri, 8 Jan 2016 20:35:38 +0000 (12:35 -0800)]
i965: Move GLSL lowering passes out of libi965_compiler.la

The scope of libi965_compiler.la is to be able to take nir shaders and
generate i965 EU code.  As such, we don't want the GLSL IR lowering
passes in the library. With this change, libi965_compiler.la no longer
needs to link to libglsl.la.

Reviewed-by: Matt Turner <mattst88@gmail.com>
8 years agoglsl: Move glsl_to_nir files to LIBGLSL_FILES
Kristian Høgsberg Kristensen [Fri, 8 Jan 2016 20:35:18 +0000 (12:35 -0800)]
glsl: Move glsl_to_nir files to LIBGLSL_FILES

libglsl_la_SOURCES includes both NIR_FILES and LIBGLSL_FILES, so for
libglsl.la consumers, this is a no-op. libnir.la however no longer uses
any GLSL IR infrastructure and can be used without also linking to
libglsl.la.

Acked-by: Matt Turner <mattst88@gmail.com>
8 years agomesa: Use separate indices for UBO & SSBO during binding
Jordan Justen [Sat, 24 Oct 2015 00:08:33 +0000 (17:08 -0700)]
mesa: Use separate indices for UBO & SSBO during binding

Previously we were treating the binding index for Uniform Buffer
Objects and Shader Storage Buffer Objects as being part of the
combined BufferInterfaceBlocks array.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93322
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
8 years agomesa: Map program UBOs and SSBOs to Interface Blocks
Jordan Justen [Sat, 24 Oct 2015 00:07:42 +0000 (17:07 -0700)]
mesa: Map program UBOs and SSBOs to Interface Blocks

v2:
 * Fill UboInterfaceBlockIndex and SsboInterfaceBlockIndex in
   split_ubos_and_ssbos (Iago)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
8 years agomesa: docs: Add link to planet.freedesktop.org
Sarah Sharp [Thu, 29 Oct 2015 22:56:18 +0000 (15:56 -0700)]
mesa: docs: Add link to planet.freedesktop.org

The freedesktop.org blog feeds aren't mentioned on either mesa3d.org or
any of the graphics project wikis (including the DRI wiki) on
freedesktop.org.  Fix that by linking to it from the sidebar.

Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
8 years agofreedreno: add ir3_compiler to gitignore
Ilia Mirkin [Fri, 8 Jan 2016 20:09:26 +0000 (15:09 -0500)]
freedreno: add ir3_compiler to gitignore

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years agogallium: add a RESQ opcode to query info about a resource
Ilia Mirkin [Mon, 14 Dec 2015 03:11:25 +0000 (22:11 -0500)]
gallium: add a RESQ opcode to query info about a resource

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
8 years agogallium: add PIPE_CAP_SHADER_BUFFER_OFFSET_ALIGNMENT
Ilia Mirkin [Sun, 3 Jan 2016 02:56:45 +0000 (21:56 -0500)]
gallium: add PIPE_CAP_SHADER_BUFFER_OFFSET_ALIGNMENT

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
8 years agogallium: add PIPE_SHADER_CAP_MAX_SHADER_BUFFERS
Ilia Mirkin [Sun, 27 Sep 2015 00:27:42 +0000 (20:27 -0400)]
gallium: add PIPE_SHADER_CAP_MAX_SHADER_BUFFERS

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
8 years agotgsi: update atomic op docs
Ilia Mirkin [Sun, 27 Sep 2015 05:23:38 +0000 (01:23 -0400)]
tgsi: update atomic op docs

Specify that the operation only applies to the x component, not
per-component as previously specified. This is unnecessary for GL and
creates additional complications for images which need to support these
operations as well.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
8 years agotgsi: add a is_store property
Ilia Mirkin [Sat, 26 Sep 2015 21:35:41 +0000 (17:35 -0400)]
tgsi: add a is_store property

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
8 years agotgsi: provide a way to encode memory qualifiers for SSBO
Ilia Mirkin [Sat, 7 Nov 2015 07:25:20 +0000 (02:25 -0500)]
tgsi: provide a way to encode memory qualifiers for SSBO

Each load/store on most hardware can specify what caching to do. Since
SSBO allows individual variables to also have separate caching modes,
allow loads/stores to have the qualifiers instead of attempting to
encode them in declarations.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
8 years agoureg: add buffer support to ureg
Ilia Mirkin [Sat, 19 Sep 2015 22:19:13 +0000 (18:19 -0400)]
ureg: add buffer support to ureg

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
8 years agotgsi: add ureg support for image decls
Ilia Mirkin [Sat, 20 Sep 2014 06:54:16 +0000 (02:54 -0400)]
tgsi: add ureg support for image decls

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
8 years agoglsl: Ensure 64bits shift is used.
Jose Fonseca [Fri, 8 Jan 2016 14:03:38 +0000 (14:03 +0000)]
glsl: Ensure 64bits shift is used.

I believe that `1u << x`, where x >= 32 yields undefined results
according to the C standard.

Particularly MSVC says `warning C4334: '<<' : result of 32-bit shift
implicitly converted to 64 bits (was 64-bit shift intended?)`.
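
A minimal sketch of the difference, assuming a platform where unsigned int
is 32 bits wide. The helper name bit64 is hypothetical.

```c
#include <stdint.h>

/* Return a 64-bit value with only bit x set; x must be in [0, 63]. */
uint64_t bit64(unsigned x)
{
    /* The 64-bit left operand makes this a 64-bit shift.  Writing 1u << x
     * would shift a 32-bit value, which is undefined for x >= 32, and only
     * afterwards widen the result. */
    return (uint64_t)1 << x;
}
```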

Reviewed-by: Brian Paul <brianp@vmware.com>
8 years agomesa/main: Avoid `void function returning a value` warning.
Jose Fonseca [Fri, 8 Jan 2016 13:59:16 +0000 (13:59 +0000)]
mesa/main: Avoid `void function returning a value` warning.

Trivial.

Reviewed-by: Brian Paul <brianp@vmware.com>
8 years agoconfigure.ac: add --enable-profile
Oded Gabbay [Thu, 7 Jan 2016 15:20:47 +0000 (17:20 +0200)]
configure.ac: add --enable-profile

For profiling mesa's code, especially llvmpipe, PROFILE should be
defined. Currently, this define can only be generated if mesa is
built using scons.
This patch makes it possible to generate this define also when building
mesa through automake tools.

v2:

- Change --enable-llvmpipe-profile to --enable-profile
- Add -fno-omit-frame-pointer to CFLAGS and CXXFLAGS when enabling profile

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
8 years agonine: allow fragment shader POSITION and FACE to be system values
Marek Olšák [Fri, 8 Jan 2016 01:11:16 +0000 (02:11 +0100)]
nine: allow fragment shader POSITION and FACE to be system values

Reported-by: Axel Davy <axel.davy@ens.fr>
8 years agovl: allow fragment shader POSITION to be a system value
Marek Olšák [Thu, 7 Jan 2016 22:14:55 +0000 (23:14 +0100)]
vl: allow fragment shader POSITION to be a system value

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
8 years agoutil/pstipple: allow fragment shader POSITION to be a system value
Marek Olšák [Thu, 7 Jan 2016 18:48:56 +0000 (19:48 +0100)]
util/pstipple: allow fragment shader POSITION to be a system value

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
8 years agost/mesa: add support for POSITION and FACE system values
Marek Olšák [Sat, 2 Jan 2016 21:45:10 +0000 (22:45 +0100)]
st/mesa: add support for POSITION and FACE system values

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
8 years agotgsi/scan: update for POSITION and FACE system values
Marek Olšák [Fri, 8 Jan 2016 00:45:34 +0000 (01:45 +0100)]
tgsi/scan: update for POSITION and FACE system values

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
8 years agogallium: add caps for POSITION and FACE system values
Marek Olšák [Sat, 2 Jan 2016 19:45:00 +0000 (20:45 +0100)]
gallium: add caps for POSITION and FACE system values

v2: document the integer behavior

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
8 years agoprogram: add a helper for rewriting FP position input to sysval
Marek Olšák [Sat, 2 Jan 2016 22:08:27 +0000 (23:08 +0100)]
program: add a helper for rewriting FP position input to sysval

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
8 years agoglsl: optionally declare gl_FragCoord & gl_FrontFacing as system values
Marek Olšák [Sat, 2 Jan 2016 19:16:16 +0000 (20:16 +0100)]
glsl: optionally declare gl_FragCoord & gl_FrontFacing as system values

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
8 years agotgsi/ureg: handle redundant declarations in ureg_DECL_system_value
Marek Olšák [Thu, 7 Jan 2016 22:37:53 +0000 (23:37 +0100)]
tgsi/ureg: handle redundant declarations in ureg_DECL_system_value

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
8 years agotgsi/ureg: remove index parameter from ureg_DECL_system_value
Marek Olšák [Thu, 7 Jan 2016 22:25:48 +0000 (23:25 +0100)]
tgsi/ureg: remove index parameter from ureg_DECL_system_value

It can be trivially derived from the number of already declared system
values. This allows ureg users not to worry about which index to choose.
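
The idea can be sketched as follows. This is an illustrative model only:
the struct, the capacity limit, and the function name are all hypothetical
and are not ureg's real data structures. The reuse of an existing slot for
a repeated semantic is likewise an assumption about how redundant
declarations might be handled.

```c
#define MAX_SYSTEM_VALUES 8   /* hypothetical capacity */

/* Hypothetical table of declared system values. */
struct sysval_table {
    unsigned semantic_name[MAX_SYSTEM_VALUES];
    unsigned count;
};

/* Declare a system value and return its index.  The index is simply the
 * number of values declared so far, so callers never pick one themselves.
 * A repeated declaration returns the existing index.  Returns -1 when the
 * table is full. */
int declare_system_value(struct sysval_table *t, unsigned semantic_name)
{
    unsigned i;

    for (i = 0; i < t->count; i++) {
        if (t->semantic_name[i] == semantic_name)
            return (int)i;
    }

    if (t->count == MAX_SYSTEM_VALUES)
        return -1;

    t->semantic_name[t->count] = semantic_name;
    return (int)t->count++;
}
```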

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
8 years agost/mesa: remove dead code from mesa_to_tgsi
Marek Olšák [Sat, 2 Jan 2016 18:58:26 +0000 (19:58 +0100)]
st/mesa: remove dead code from mesa_to_tgsi

These aren't part of ARB_fragment_program.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
8 years agoradeon, si: Use TGSI chan name defines in lp_build_emit_fetch() calls
Edward O'Callaghan [Thu, 7 Jan 2016 16:44:46 +0000 (03:44 +1100)]
radeon, si: Use TGSI chan name defines in lp_build_emit_fetch() calls

Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agogallium/aux: Use TGSI chan name defines in place of literals
Edward O'Callaghan [Thu, 7 Jan 2016 16:44:45 +0000 (03:44 +1100)]
gallium/aux: Use TGSI chan name defines in place of literals

Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agomesa: check that internalformat of CopyTexImage*D is not 1, 2, 3, 4
Nicolai Hähnle [Thu, 7 Jan 2016 20:27:52 +0000 (15:27 -0500)]
mesa: check that internalformat of CopyTexImage*D is not 1, 2, 3, 4

The piglit copyteximage check has recently been augmented to test this, but
apparently it hasn't been fixed in Mesa so far.

This language also already appears in the OpenGL 2.1 spec (Ian).

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
8 years agoi965/compiler: Enable more lowering in NIR
Jason Ekstrand [Wed, 6 Jan 2016 23:30:39 +0000 (15:30 -0800)]
i965/compiler: Enable more lowering in NIR

We don't need these for GLSL or ARB, but we need them for SPIR-V.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
8 years agonir/algebraic: Add more lowering
Jason Ekstrand [Wed, 6 Jan 2016 23:30:38 +0000 (15:30 -0800)]
nir/algebraic: Add more lowering

This commit adds lowering options for the following opcodes:

 - nir_op_fmod
 - nir_op_bitfield_insert
 - nir_op_uadd_carry
 - nir_op_usub_borrow

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
8 years agonir/opcodes: Fix up uadd_carry and usub_borrow
Jason Ekstrand [Wed, 6 Jan 2016 23:30:37 +0000 (15:30 -0800)]
nir/opcodes: Fix up uadd_carry and usub_borrow

Both were defined as returning bool but the gpu_shader5 functions are
defined to return int.  Also, we had the parameters for usub_borrow
backwards in the folding expression.
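
For reference, the intended semantics can be sketched in C as below,
following the gpu_shader5 definitions of uaddCarry and usubBorrow. This is
not NIR's constant-folding code; the function names merely mirror the
opcode names. Both return an integer 0 or 1, and the operand order matters
for the borrow.

```c
#include <stdint.h>

/* 1 if a + b overflows 32 bits, else 0. */
uint32_t uadd_carry(uint32_t a, uint32_t b)
{
    return (uint32_t)((uint32_t)(a + b) < a);
}

/* 1 if a - b needs a borrow (that is, a < b), else 0.  Swapping the
 * operands would give the wrong answer whenever a != b. */
uint32_t usub_borrow(uint32_t a, uint32_t b)
{
    return (uint32_t)(a < b);
}
```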

Reviewed-by: Matt Turner <mattst88@gmail.com>
8 years agonvc0: add ARB_indirect_parameters support
Ilia Mirkin [Sat, 2 Jan 2016 16:38:42 +0000 (11:38 -0500)]
nvc0: add ARB_indirect_parameters support

I chose to make separate macros for this due to the additional
complexity and extra scratch usage.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years agost/mesa: expose ARB_indirect_parameters when the backend driver allows
Ilia Mirkin [Thu, 31 Dec 2015 21:17:19 +0000 (16:17 -0500)]
st/mesa: expose ARB_indirect_parameters when the backend driver allows

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
8 years agomesa: add support for ARB_indirect_parameters draw functions
Ilia Mirkin [Thu, 31 Dec 2015 21:11:56 +0000 (16:11 -0500)]
mesa: add support for ARB_indirect_parameters draw functions

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
8 years agomesa: add parameter buffer, used for ARB_indirect_parameters
Ilia Mirkin [Thu, 31 Dec 2015 20:47:17 +0000 (15:47 -0500)]
mesa: add parameter buffer, used for ARB_indirect_parameters

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
8 years agoglapi: add ARB_indirect_parameters definitions
Ilia Mirkin [Thu, 31 Dec 2015 20:19:51 +0000 (15:19 -0500)]
glapi: add ARB_indirect_parameters definitions

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
8 years agonvc0: add support for real ARB_multi_draw_indirect
Ilia Mirkin [Sat, 2 Jan 2016 05:45:56 +0000 (00:45 -0500)]
nvc0: add support for real ARB_multi_draw_indirect

The draw groups are now split up into groups of 32 if there's a
non-packed stride, or in groups of 400-500 if the draw data is packed.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years agonvc0: adjust indirect draw macros to handle multiple draws at once
Ilia Mirkin [Sat, 2 Jan 2016 05:06:22 +0000 (00:06 -0500)]
nvc0: adjust indirect draw macros to handle multiple draws at once

These are still invoked one at a time, but the underlying macro can
handle multiple draws.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years agost/mesa: add support for new mesa indirect draw interface
Ilia Mirkin [Thu, 31 Dec 2015 19:11:07 +0000 (14:11 -0500)]
st/mesa: add support for new mesa indirect draw interface

This shifts all indirect draws to go through the new function. If the
driver doesn't have support for multi draws, we break those up and
perform N draws. Otherwise, we pass everything through for just a single
draw call.
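
The fallback described above could be sketched like this. Every name here
(struct indirect_draw, submit_fn, submit_indirect, count_draws) is
hypothetical and chosen for illustration; none of it is the gallium or
st/mesa interface.

```c
#include <stddef.h>

/* Hypothetical description of an indirect draw: draw_count records laid
 * out stride bytes apart, starting at offset in the indirect buffer. */
struct indirect_draw {
    size_t offset;
    size_t stride;
    unsigned draw_count;
};

/* Hypothetical driver entry point. */
typedef void (*submit_fn)(const struct indirect_draw *draw, void *user_data);

/* Submit an indirect draw and return how many driver calls were made.
 * With multi-draw support everything goes through in one call; without
 * it, the draw is broken into draw_count single draws. */
unsigned submit_indirect(const struct indirect_draw *draw,
                         int has_multi_draw,
                         submit_fn submit, void *user_data)
{
    struct indirect_draw one;
    unsigned i;

    if (has_multi_draw) {
        submit(draw, user_data);
        return 1;
    }

    one = *draw;
    one.draw_count = 1;
    for (i = 0; i < draw->draw_count; i++) {
        one.offset = draw->offset + (size_t)i * draw->stride;
        submit(&one, user_data);
    }
    return draw->draw_count;
}

/* Example callback: accumulate the number of draws requested. */
void count_draws(const struct indirect_draw *draw, void *user_data)
{
    *(unsigned *)user_data += draw->draw_count;
}
```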

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
8 years agogallium: add caps to expose support for multi indirect draws
Ilia Mirkin [Thu, 31 Dec 2015 18:30:13 +0000 (13:30 -0500)]
gallium: add caps to expose support for multi indirect draws

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
8 years agogallium: add sufficient draw interface to allow new indirect features
Ilia Mirkin [Thu, 31 Dec 2015 18:07:49 +0000 (13:07 -0500)]
gallium: add sufficient draw interface to allow new indirect features

This makes it possible to support indirect multidraws as well as having
the number of such draws come from a separate GPU resource.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
8 years agovbo: create a new draw function interface for indirect draws
Ilia Mirkin [Wed, 30 Dec 2015 23:10:56 +0000 (18:10 -0500)]
vbo: create a new draw function interface for indirect draws

All indirect draws are passed to the new draw function. By default
there's a fallback implementation which pipes it right back to
draw_prims, but eventually both the fallback and draw_prim's support for
indirect drawing should be removed.

This should allow a backend to properly support ARB_multi_draw_indirect
and ARB_indirect_parameters.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
8 years agollvmpipe: do 64bit plane calculations in the sse path
Roland Scheidegger [Sat, 2 Jan 2016 03:59:09 +0000 (04:59 +0100)]
llvmpipe: do 64bit plane calculations in the sse path

The sse path was pretty much disabled for practical purposes because the
largest allowed fb size was 128x128. So, adapt it for 64bit plane calculations.
This is actually not that difficult, though a problem is that we can't do
a signed 32x32->64bit mul, only unsigned, so need to fix that up. Overall,
the code still looks reasonable, though it's not like changes there in
setup really make much of a difference in the end...
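
The fix-up can be illustrated in scalar C. This is a sketch of the
arithmetic only, not the SSE code from the commit, and the function name is
hypothetical. Reinterpreting a negative 32-bit value as unsigned adds 2^32
to it, so the unsigned product contains extra terms in its upper 32 bits
that have to be subtracted again.

```c
#include <stdint.h>

/* Compute the signed 64-bit product of two signed 32-bit values using only
 * an unsigned 32x32->64 multiply plus a correction of the high half. */
int64_t mul_s32_via_u32(int32_t a, int32_t b)
{
    uint64_t prod = (uint64_t)(uint32_t)a * (uint64_t)(uint32_t)b;
    uint32_t fix = 0;

    /* For each negative operand, the unsigned product is too large by the
     * other operand times 2^32 (modulo 2^64). */
    if (a < 0)
        fix += (uint32_t)b;
    if (b < 0)
        fix += (uint32_t)a;

    prod -= (uint64_t)fix << 32;

    /* Conversion back to signed relies on two's complement wrap-around,
     * which mainstream compilers provide. */
    return (int64_t)prod;
}
```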

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
8 years agollvmpipe: don't store eo as 64bit int
Roland Scheidegger [Sat, 2 Jan 2016 03:58:37 +0000 (04:58 +0100)]
llvmpipe: don't store eo as 64bit int

eo, just like dcdx and dcdy, cannot overflow 32bit.
Store it as unsigned though just in case (it cannot be negative, but
in theory twice as big as dcdx or dcdy so this gives it one more bit).
This doesn't really change anything, albeit it might help minimally on
32bit archs.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
8 years agollvmpipe: use aligned data for the assembly program in setup
Roland Scheidegger [Thu, 31 Dec 2015 02:20:38 +0000 (03:20 +0100)]
llvmpipe: use aligned data for the assembly program in setup

Back in the day (before 24678700edaf5bb9da9be93a1367f1a24cfaa471) the values
were not actually in a struct but even then I can't see why we didn't simply
align the values. Especially since it's trivial to do so.
(Not that it actually matters since the code is pretty much unused for now.)

Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
8 years agodraw: initialize prim header flags when clipping lines
Roland Scheidegger [Thu, 7 Jan 2016 18:38:15 +0000 (19:38 +0100)]
draw: initialize prim header flags when clipping lines

Otherwise, clipped lines would have an undefined stippling reset bit if line
stippling is enabled.
(Untested, and I just assume copying over the bits from the original line
is actually the right thing to do.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
8 years agodraw: fix line stippling with unfilled prims
Roland Scheidegger [Wed, 6 Jan 2016 22:49:30 +0000 (23:49 +0100)]
draw: fix line stippling with unfilled prims

The unfilled stage was not filling in the prim header, and the line stage
then decided to reset the stipple counter or not based on the uninitialized
data. This causes some failures in conform linestipple test (albeit quite
randomly happening depending on environment).
So fill in the prim header in the unfilled stage - I am not entirely sure
if anybody really needs determinant after that stage, but there's at least
later stages (wide line for instance) which copy over the determinant as well.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
8 years agoglsl: replace null check with assert
Timothy Arceri [Tue, 14 Jul 2015 13:30:27 +0000 (23:30 +1000)]
glsl: replace null check with assert

This was added in 54f583a20; since then, error handling has improved.

The test this was added to fix now fails earlier, since 01822706ec.

Reviewed-by: Matt Turner <mattst88@gmail.com>
8 years agoi965: use _mesa_delete_buffer_object
Nicolai Hähnle [Wed, 6 Jan 2016 02:51:27 +0000 (21:51 -0500)]
i965: use _mesa_delete_buffer_object

This is more future-proof, plugs the memory leak of Label and properly
destroys the buffer mutex.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
8 years agoi915: use _mesa_delete_buffer_object
Nicolai Hähnle [Wed, 6 Jan 2016 02:51:13 +0000 (21:51 -0500)]
i915: use _mesa_delete_buffer_object

This is more future-proof, plugs the memory leak of Label and properly
destroys the buffer mutex.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
8 years agoradeon: use _mesa_delete_buffer_object
Nicolai Hähnle [Wed, 6 Jan 2016 02:49:37 +0000 (21:49 -0500)]
radeon: use _mesa_delete_buffer_object

This is more future-proof, plugs the memory leak of Label and properly
destroys the buffer mutex.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
8 years agost/mesa: use _mesa_delete_buffer_object
Nicolai Hähnle [Wed, 6 Jan 2016 02:49:11 +0000 (21:49 -0500)]
st/mesa: use _mesa_delete_buffer_object

This is more future-proof than the current code.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
8 years agomesa/bufferobj: make _mesa_delete_buffer_object externally accessible
Nicolai Hähnle [Wed, 6 Jan 2016 02:47:04 +0000 (21:47 -0500)]
mesa/bufferobj: make _mesa_delete_buffer_object externally accessible

gl_buffer_object has grown more complicated and requires cleanup. Using this
function from drivers will be more future-proof.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
8 years agollvmpipe: use sse2 conv code for altivec
Oded Gabbay [Thu, 7 Jan 2016 17:50:12 +0000 (19:50 +0200)]
llvmpipe: use sse2 conv code for altivec

In lp_build_conv() and lp_build_conv_auto(), there is a special case of
conversion when sse2 is present. That code path is suitable without any
changes to altivec, because all the functions that are called in that
code path already support altivec.

This patch increases the FPS on the POWER architecture across the board
by 10%-25%.

I checked ipers, glxgears, glxspheres64, openarena, xonotic and glmark2.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
8 years agoradeonsi: adjust the parameters of si_shader_dump
Marek Olšák [Wed, 6 Jan 2016 01:30:13 +0000 (02:30 +0100)]
radeonsi: adjust the parameters of si_shader_dump

The function will be extended to dump all the binaries that shaders will
consist of, so si_shader* makes sense here.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: move si_shader_dump call out of si_compile_llvm
Marek Olšák [Sun, 3 Jan 2016 16:18:04 +0000 (17:18 +0100)]
radeonsi: move si_shader_dump call out of si_compile_llvm

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: inline si_shader_binary_read
Marek Olšák [Sun, 3 Jan 2016 16:05:05 +0000 (17:05 +0100)]
radeonsi: inline si_shader_binary_read

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: move si_shader_dump call out of si_shader_binary_read
Marek Olšák [Sun, 3 Jan 2016 16:03:24 +0000 (17:03 +0100)]
radeonsi: move si_shader_dump call out of si_shader_binary_read

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: separate shader dumping code to si_shader_dump and *_dump_stats
Marek Olšák [Sun, 3 Jan 2016 15:39:24 +0000 (16:39 +0100)]
radeonsi: separate shader dumping code to si_shader_dump and *_dump_stats

Eventually, I'd like to dump stats for several combined binaries, which is
why you don't see a binary parameter in si_shader_dump_stats.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: add si_shader_destroy_binary
Marek Olšák [Sun, 27 Dec 2015 23:53:29 +0000 (00:53 +0100)]
radeonsi: add si_shader_destroy_binary

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: don't pass si_shader to si_compile_llvm
Marek Olšák [Mon, 28 Dec 2015 00:45:00 +0000 (01:45 +0100)]
radeonsi: don't pass si_shader to si_compile_llvm

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: move si_shader_binary_upload out of si_compile_llvm
Marek Olšák [Sun, 27 Dec 2015 22:47:00 +0000 (23:47 +0100)]
radeonsi: move si_shader_binary_upload out of si_compile_llvm

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: always keep shader code, rodata, and relocs in memory
Marek Olšák [Sun, 27 Dec 2015 22:35:08 +0000 (23:35 +0100)]
radeonsi: always keep shader code, rodata, and relocs in memory

We won't compile shaders in draw calls, but we will concatenate shader
binaries according to states in draw calls, so keep the binaries.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: don't pass si_shader to si_shader_binary_read
Marek Olšák [Mon, 28 Dec 2015 00:45:00 +0000 (01:45 +0100)]
radeonsi: don't pass si_shader to si_shader_binary_read

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: don't pass si_shader to si_shader_binary_read_config
Marek Olšák [Mon, 28 Dec 2015 00:45:00 +0000 (01:45 +0100)]
radeonsi: don't pass si_shader to si_shader_binary_read_config

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: add struct si_shader_config
Marek Olšák [Sun, 27 Dec 2015 23:14:05 +0000 (00:14 +0100)]
radeonsi: add struct si_shader_config

There will be 1 config per variant, which will be a union of configs
from {prolog, main, epilog}. For now, just add the structure.
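
One way to read "a union of configs" is that the variant must satisfy the largest requirement of any of its parts. The sketch below assumes that reading: it takes the maximum of each resource count and ORs the bitmasks together. The struct, its fields, and the function names are hypothetical and do not reproduce the real si_shader_config.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical subset of a per-binary config; field names are illustrative. */
struct demo_shader_config {
   unsigned num_sgprs;
   unsigned num_vgprs;
   unsigned scratch_bytes_per_wave;
   uint32_t input_enable_mask;
};

static unsigned
demo_max(unsigned a, unsigned b)
{
   return a > b ? a : b;
}

/* Combine the configs of all parts (e.g. prolog, main, epilog) into one
 * config that is large enough for every part. */
static struct demo_shader_config
demo_merge_configs(const struct demo_shader_config *parts, unsigned count)
{
   struct demo_shader_config out = {0};

   for (unsigned i = 0; i < count; i++) {
      out.num_sgprs = demo_max(out.num_sgprs, parts[i].num_sgprs);
      out.num_vgprs = demo_max(out.num_vgprs, parts[i].num_vgprs);
      out.scratch_bytes_per_wave = demo_max(out.scratch_bytes_per_wave,
                                            parts[i].scratch_bytes_per_wave);
      out.input_enable_mask |= parts[i].input_enable_mask;
   }
   return out;
}
```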

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: move NULL exporting into a separate function
Marek Olšák [Sun, 27 Dec 2015 19:05:19 +0000 (20:05 +0100)]
radeonsi: move NULL exporting into a separate function

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: move MRT color exporting into a separate function
Marek Olšák [Sun, 27 Dec 2015 19:02:41 +0000 (20:02 +0100)]
radeonsi: move MRT color exporting into a separate function

This will be used by a fragment shader epilog.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: use EXP_NULL for pixel shaders without outputs
Marek Olšák [Sun, 27 Dec 2015 18:36:33 +0000 (19:36 +0100)]
radeonsi: use EXP_NULL for pixel shaders without outputs

This never happens currently.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: only use LLVMBuildLoad once when updating color outputs at the end
Marek Olšák [Sun, 27 Dec 2015 16:53:44 +0000 (17:53 +0100)]
radeonsi: only use LLVMBuildLoad once when updating color outputs at the end

without LLVMBuildStore.

So:
- do LLVMBuildLoad
- update the values as necessary
- export

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: export "undef" values for undefined PS outputs
Marek Olšák [Sun, 27 Dec 2015 16:45:52 +0000 (17:45 +0100)]
radeonsi: export "undef" values for undefined PS outputs

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: move MRTZ export into a separate function
Marek Olšák [Sun, 27 Dec 2015 16:38:37 +0000 (17:38 +0100)]
radeonsi: move MRTZ export into a separate function

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: simplify setting the DONE bit for PS exports
Marek Olšák [Wed, 23 Dec 2015 17:06:04 +0000 (18:06 +0100)]
radeonsi: simplify setting the DONE bit for PS exports

First find out what the last export is and simply set the DONE bit there.
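
A minimal sketch of that approach, assuming the exports are first collected in an array: no export tracks whether it might be the final one, and the DONE flag is set in a single pass at the end. The demo_export struct and the function name are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record of one pixel shader export. */
struct demo_export {
   unsigned target;   /* export target, e.g. an MRT index */
   bool done;         /* DONE bit: marks the final export of the shader */
};

/* Clear DONE on every export, then set it on the last one only.
 * Returns the index of the last export, or -1 if there are none. */
static int
demo_mark_last_export_done(struct demo_export *exports, size_t count)
{
   for (size_t i = 0; i < count; i++)
      exports[i].done = false;

   if (count == 0)
      return -1;

   exports[count - 1].done = true;
   return (int)(count - 1);
}
```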

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: set SPI color formats and CB_SHADER_MASK outside of compilation
Marek Olšák [Wed, 23 Dec 2015 15:43:54 +0000 (16:43 +0100)]
radeonsi: set SPI color formats and CB_SHADER_MASK outside of compilation

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: write all MRTs only if there is exactly one output
Marek Olšák [Wed, 23 Dec 2015 15:24:02 +0000 (16:24 +0100)]
radeonsi: write all MRTs only if there is exactly one output

This doesn't fix a known bug, but better safe than sorry.

Also, simplify the expression in si_shader.c.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: determine SPI_SHADER_Z_FORMAT outside of shader compilation
Marek Olšák [Wed, 23 Dec 2015 15:02:46 +0000 (16:02 +0100)]
radeonsi: determine SPI_SHADER_Z_FORMAT outside of shader compilation

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>