mesa.git
Marek Olšák [Sat, 7 Nov 2015 14:00:55 +0000 (15:00 +0100)]
gallium/radeon: simplify restoring render condition after flush

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sat, 7 Nov 2015 13:55:23 +0000 (14:55 +0100)]
gallium/radeon: don't use PREDICATION_OP_CLEAR

Not setting the predication bit is sufficient.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sat, 7 Nov 2015 13:45:58 +0000 (14:45 +0100)]
gallium/radeon: simplify disabling render condition for u_blitter

Just disable it by not setting the predication bit.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sat, 7 Nov 2015 13:36:38 +0000 (14:36 +0100)]
r600g: don't set predication on non-draw packets

This has no effect.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sat, 7 Nov 2015 13:00:30 +0000 (14:00 +0100)]
gallium/radeon: inline the r600_rings structure

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sat, 7 Nov 2015 11:22:56 +0000 (12:22 +0100)]
radeonsi: prevent recursion in si_context_gfx_flush

The recursion can only occur if you modify need_cs_space to always flush.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
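
A minimal sketch of the kind of re-entrancy guard this message describes. The struct, the `flushing` flag and all `sketch_*` names are hypothetical and are not taken from radeonsi; the stand-in for need_cs_space is deliberately modified to always flush, which is the situation the commit mentions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified context; not the real si_context. */
struct sketch_ctx {
    bool flushing;    /* true while a flush is in progress */
    int flush_count;  /* number of flushes actually performed */
};

static void sketch_flush(struct sketch_ctx *ctx);

/* Stand-in for need_cs_space, modified to always flush. */
static void sketch_need_cs_space(struct sketch_ctx *ctx)
{
    sketch_flush(ctx);
}

static void sketch_flush(struct sketch_ctx *ctx)
{
    assert(ctx != NULL);

    if (ctx->flushing)
        return; /* already inside a flush: do not recurse */

    ctx->flushing = true;
    sketch_need_cs_space(ctx); /* would recurse forever without the guard */
    ctx->flush_count++;
    ctx->flushing = false;
}
```

With the guard in place, one call to sketch_flush() performs exactly one flush even though the space check calls back into it.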
Marek Olšák [Sat, 7 Nov 2015 12:43:18 +0000 (13:43 +0100)]
gallium/radeon: remove the IB flushing flag

Not needed anymore. A similar flag will be introduced in the next commit,
which will be private in radeonsi.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sat, 7 Nov 2015 12:31:03 +0000 (13:31 +0100)]
gallium/radeon: move GFX/DMA flushing from add_to_buffer_list to need_cs_space

need_cs_space isn't invoked so often and is called before all commands too.
This is a lot cleaner. The code in radeon_add_to_buffer_list always seemed
dodgy to me.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Fri, 6 Nov 2015 20:11:16 +0000 (21:11 +0100)]
radeonsi: rename cache flushing flags once more

KCACHE, TC L1 and TC L2 are renamed to:
- SMEM L1
- VMEM L1
- GLOBAL L2

You can easily tell what they are used for now.
Shaders must deal with coherency issues between both L1s manually,
e.g. by setting GLC=1 or by using s_dcache_*.

BOTH_ICACHE_KCACHE was an unused definition.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sat, 7 Nov 2015 11:07:31 +0000 (12:07 +0100)]
radeonsi: set the DISABLE_WR_CONFIRM flag on CI-VI as well

I missed this in commit c3e527f93d4281ad6e2ca165eaf6ff588e4faefa
    radeonsi: only enable write confirmation on the last CP DMA packet

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Thu, 5 Nov 2015 22:56:38 +0000 (23:56 +0100)]
radeonsi: initialize SX_PS_DOWNCONVERT to 0 on Stoney

otherwise the SX or CB blocks can go bananas

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: mesa-stable@lists.freedesktop.org
Marek Olšák [Tue, 3 Nov 2015 18:35:46 +0000 (19:35 +0100)]
radeonsi: add glClearBufferSubData acceleration

8-bit and 16-bit clears which are not aligned to dwords are done in software.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
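
A rough sketch of the alignment decision described above, assuming the accelerated path works on whole dwords. The function names are hypothetical, and replicating a narrow clear value into a dword is an assumption of this sketch rather than something the commit message states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Replicate an 8-, 16- or 32-bit clear value into one 32-bit dword. */
static uint32_t sketch_expand_clear_value(uint32_t value, unsigned size_bytes)
{
    assert(size_bytes == 1 || size_bytes == 2 || size_bytes == 4);

    if (size_bytes == 1) {
        value &= 0xffu;
        return value | (value << 8) | (value << 16) | (value << 24);
    }
    if (size_bytes == 2) {
        value &= 0xffffu;
        return value | (value << 16);
    }
    return value;
}

/* True if both the start and the length of the range are dword aligned,
 * i.e. the range could take a dword-based path in this sketch. */
static bool sketch_range_is_dword_aligned(uint64_t offset, uint64_t size)
{
    return (offset % 4 == 0) && (size % 4 == 0);
}
```

A caller would fall back to a CPU fill whenever sketch_range_is_dword_aligned() returns false for an 8-bit or 16-bit clear.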
Marek Olšák [Fri, 6 Nov 2015 22:16:11 +0000 (23:16 +0100)]
radeonsi: add SI_SAVE_FRAGMENT_STATE blitter flag

Buffer clears via transform feedback won't set this.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Fri, 6 Nov 2015 22:41:15 +0000 (23:41 +0100)]
gallium/u_blitter: add support for multi-dword clear values in clear_buffer

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Fri, 6 Nov 2015 22:42:49 +0000 (23:42 +0100)]
radeonsi: fix a future crash in emit_cb_target_mask

This can't crash currently, but it would crash if clear_buffer
from u_blitter were used with a clean context.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Fri, 6 Nov 2015 22:06:47 +0000 (23:06 +0100)]
radeonsi: fix unaligned clear_buffer fallback

This is unreachable currently, but it will be used by unaligned 8-bit and
16-bit fills.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Thu, 5 Nov 2015 11:24:20 +0000 (12:24 +0100)]
r600g: fix clear_buffer fallback with offset != 0

Discovered by luck. This code path hasn't been exercised since transform
feedback was implemented.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sat, 7 Nov 2015 18:31:55 +0000 (19:31 +0100)]
gallium/radeon: fix PIPE_QUERY_GPU_FINISHED

Broken by the addition of r600_multi_fence
in 3b37155a68acc351cba86a1fa142bd0de2192d4c

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89014

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Brian Paul [Fri, 13 Nov 2015 15:02:05 +0000 (08:02 -0700)]
mesa: minor comment fix in blend.c

Brian Paul [Fri, 13 Nov 2015 15:01:29 +0000 (08:01 -0700)]
docs: add link to Coverity on developer utilities page

Signed-off-by: Brian Paul <brianp@vmware.com>
Brian Paul [Fri, 13 Nov 2015 14:59:42 +0000 (07:59 -0700)]
docs: update VMware driver instructions

Use a LIBDIR variable, set per-platform.
Update the Mesa configuration flags.
Run update-initramfs or dracut, update /etc/modules

Signed-off-by: Brian Paul <brianp@vmware.com>
Daniel Stone [Sat, 7 Nov 2015 18:25:31 +0000 (18:25 +0000)]
egl/wayland: Ignore rects from SwapBuffersWithDamage

eglSwapBuffersWithDamage accepts damage-region rectangles to hint to the
compositor that it only needs to redraw certain areas; these rectangles
were passed through the wl_surface_damage request, as designed.

Wayland also offers a buffer transformation interface, e.g. to allow
users to render pre-rotated buffers. Unfortunately, there is no way to
query buffer transforms, and the damage region was provided in surface,
rather than buffer, co-ordinate space.

Users could in theory account for this themselves, but EGL also requires
co-ordinates to be passed in GL/mathematical co-ordinate space, with an
inversion to Wayland's natural/scanout co-ordinate space, so
transformations other than a 180-degree rotation will fail as EGL
attempts to subtract the region from (its view of the) surface height.

Pending creation and acceptance of a wl_surface.buffer_damage request,
which will accept co-ordinates in buffer co-ordinate space, pessimise to
always sending full-surface damage.

bce64c6c provides the explanation for why we send maximum-range damage,
rather than the full size of the surface: in the presence of buffer
transformations, full-surface damage may not actually cover the entire
surface.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
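
The Y inversion and the pessimised damage mentioned in this message can be sketched as follows. The rectangle type and function names are hypothetical, and using INT32_MAX for "maximum-range" damage is an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical rectangle type used only in this sketch. */
struct sketch_rect {
    int32_t x, y, width, height;
};

/* Convert a rectangle from GL-style coordinates (origin at the bottom left)
 * to scanout-style coordinates (origin at the top left) by subtracting it
 * from the surface height. This is only valid when the surface height used
 * here matches the space the rectangle was specified in, which is what
 * breaks down under buffer transforms. */
static struct sketch_rect
sketch_flip_rect(struct sketch_rect egl, int32_t surface_height)
{
    struct sketch_rect wl = egl;

    assert(egl.y >= 0 && egl.height >= 0);
    assert(surface_height >= egl.y + egl.height);
    wl.y = surface_height - (egl.y + egl.height);
    return wl;
}

/* Pessimised damage: one rectangle large enough to cover any surface. */
static struct sketch_rect sketch_full_damage(void)
{
    struct sketch_rect r = { 0, 0, INT32_MAX, INT32_MAX };
    return r;
}
```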
Iago Toral Quiroga [Fri, 13 Nov 2015 07:51:06 +0000 (08:51 +0100)]
Revert "nir/copy_propagate: do not copy-propagate MOV srcs with source modifiers"

The change proposed in the review leads to piglit regressions because
is_move() is used in other places and relies on the checks for source
modifiers to be there.

Revert this until we agree on a better solution.

Samuel Iglesias Gonsálvez [Thu, 12 Nov 2015 15:14:07 +0000 (16:14 +0100)]
glsl: fix 'shared' layout qualifier related regressions

Commit 8b28b35 added 'shared' as a keyword for compute shaders
but it broke the existing 'shared' layout qualifier support for
uniform and shader storage blocks.

This patch fixes 578 dEQP-GLES31.functional.ssbo.* tests.

v2:
- Move SHARED to interface_block_layout_qualifier (Timothy)
- Don't remove "shared" case insensitive check (Timothy)
- Remove the clearing of shared_storage flag (Timothy)

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Iago Toral Quiroga [Fri, 6 Nov 2015 11:08:49 +0000 (12:08 +0100)]
nir/copy_propagate: do not copy-propagate MOV srcs with source modifiers

If a source operand in a MOV has source modifiers, then we cannot
copy-propagate it from the parent instruction and remove the MOV.

v2: remove the check for source modifiers from is_move() (Jason)

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
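
A toy illustration of the rule stated above, using hypothetical types that are much simpler than NIR's: a MOV whose source carries a modifier is not a plain copy, so a use of its destination cannot simply be redirected to the MOV's source register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified IR; these are not NIR data structures. */
struct sketch_src {
    int reg;
    bool negate;    /* source modifier: -x */
    bool absolute;  /* source modifier: |x| */
};

struct sketch_mov {
    int dest_reg;
    struct sketch_src src;
};

/* A MOV is a plain copy only if its source has no modifiers. */
static bool sketch_mov_is_plain_copy(const struct sketch_mov *mov)
{
    assert(mov != NULL);
    return !mov->src.negate && !mov->src.absolute;
}

/* Register that a use of the MOV's destination should read: the MOV's
 * source if the copy can be propagated, otherwise the destination itself. */
static int sketch_propagate(const struct sketch_mov *mov)
{
    return sketch_mov_is_plain_copy(mov) ? mov->src.reg : mov->dest_reg;
}
```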
Jason Ekstrand [Fri, 13 Nov 2015 05:52:37 +0000 (21:52 -0800)]
nir/vars_to_ssa: Delete dead output set code

This was a remnant of an early attempt to handle output reads in
vars_to_ssa.  That attempt was abandoned a long time ago but these few lines
were apparently left in the pass and managed to evade review.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Jason Ekstrand [Fri, 13 Nov 2015 02:10:22 +0000 (18:10 -0800)]
nir/vars_to_ssa: Rework copy set handling in lower_copies_to_load_store

Previously, we walked through a given deref_node's copies and, after
lowering the copy away, removed it from both the source and destination
copy sets.  This commit changes this to only remove it from the other
node's copy set (not the one we're lowering).  At the end of the loop, we
just throw away the copy set for the node we're lowering since that node no
longer has any copies.  This has two advantages:

 1) It's more efficient because we're doing potentially half as many set
    search operations.

 2) It now properly handles copies from a node to itself.  Previously, it
    would delete the copy from the set when processing the destination and
    then assert-fail when we couldn't find it for the source.

Cc: "11.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92588
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Jason Ekstrand [Tue, 10 Nov 2015 22:13:47 +0000 (14:13 -0800)]
nir/validate: Allow subroutine types for the tails of derefs

The shader-subroutine code creates uniforms of type SUBROUTINE for
subroutines that are then read as integers in the backends.  If we ever
want to do any optimizations on these, we'll need to come up with a better
plan where they are actual scalars or something, but this works for now.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92859
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Nanley Chery [Fri, 16 Oct 2015 17:14:39 +0000 (10:14 -0700)]
mesa: Replace gl_extensions::EXT_texture3D with ::dummy_true

Mesa unconditionally sets this driver flag to true in
_mesa_init_extensions(). There is therefore no need for
the driver to communicate support for this extension.
Replace the driver capability flag with ::dummy_true.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
Brian Paul [Thu, 12 Nov 2015 22:59:21 +0000 (15:59 -0700)]
mesa: fix MSVC build break in extensions.h

Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Ilia Mirkin [Mon, 14 Sep 2015 20:23:29 +0000 (16:23 -0400)]
nvc0/ir: add support for TGSI_SEMANTIC_HELPER_INVOCATION

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ilia Mirkin [Mon, 14 Sep 2015 20:23:04 +0000 (16:23 -0400)]
gallium: add support for gl_HelperInvocation semantic

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Ilia Mirkin [Mon, 14 Sep 2015 20:13:43 +0000 (16:13 -0400)]
glsl: add gl_HelperInvocation system value

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Jordan Justen [Thu, 12 Nov 2015 06:02:06 +0000 (22:02 -0800)]
glsl: Correctly handle vector extract on function parameter

The following commit accidentally used a '==' when '=' was intended:

commit 96b22fb080894ba1840af2372f28a46cc0f40c76
Author: Kristian Høgsberg Kristensen <krh@bitplanet.net>
Date:   Wed Nov 4 14:58:54 2015 -0800

    glsl: Use array deref for access to vector components

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Nanley Chery [Thu, 15 Oct 2015 19:34:43 +0000 (12:34 -0700)]
mesa: In helpers, only check driver capability for meta

Make API context and version checks done by the helper functions pass
unconditionally while meta is in progress. This transparently makes
extension checks solely dependent on struct gl_extensions while in meta.

v2: Use an 8-bit data type instead of a GLuint

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
Nanley Chery [Mon, 26 Oct 2015 22:22:24 +0000 (15:22 -0700)]
mesa/extensions: Prefix global struct and extension type

Rename the following types and variables:
* struct extension -> struct mesa_extension,
  like the mesa_format type.
* extension_table -> _mesa_extension_table,
  like the _mesa_extension_override_{enables,disables} structs.

Suggested-by: Marek Olšák <marek.olsak@amd.com>
Suggested-by: Chad Versace <chad.versace@intel.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
Nanley Chery [Thu, 17 Sep 2015 22:49:40 +0000 (15:49 -0700)]
mesa: Generate a helper function for each extension

Generate functions which determine if an extension is supported in the
current context. Initially, enums were going to be explicitly used with
_mesa_extension_supported(). The idea to embed the function and enums
into generated helper functions was suggested by Kristian Høgsberg.

For performance, the function body no longer uses
_mesa_extension_supported() and, as suggested by Chad Versace, the
functions are also declared static inline.

v2: Place function qualifiers on separate line (Chad)
v3: Move function curly brace to new line (Chad)

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
Nanley Chery [Mon, 21 Sep 2015 18:23:33 +0000 (11:23 -0700)]
mesa/extensions: Replace extension::api_set with ::version

The api_set field has no users outside of _mesa_extension_supported().
Remove it and allow the version field to take its place.

The brunt of the transformation was performed with the following vim commands:
s/\(GL [^,]\+\),\s*\d*,\s*\d*\(,\s*\d*\)\(,\s*\d*\)/\1, GLL, GLC\2\3/g
s/\(GLL [^,]\+\)\,\s*\d*/\1, GLL/g
s/\(GLC [^,]\+\)\(,\s*\d*\),\s*\d*\(,\s*\d*\)\(,\s*\d*\)/\1\2, GLC\3\4/g
s/\( ES1[^,]*\)\(,\s*\(\w\|\d\)\+\)\(,\s*\(\w\|\d\)\+\),\s*\d*/\1\2\4, ES1/g
s/\( ES2[^,]*\)\(,\s*\(\w\|\d\)\+\)\(,\s*\(\w\|\d\)\+\)\(,\s*\(\w\|\d\)\+\),\s*\d*/\1\2\4\6, ES2/g

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
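
A sketch of how a per-API minimum version could stand in for an API bit set. The GLL/GLC/ES1/ES2 columns follow the substitutions above; the struct layout, the function name and the 0xff "never supported" marker are assumptions of this sketch, not the actual Mesa definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum sketch_api {
    SKETCH_GLL, /* legacy/compatibility GL */
    SKETCH_GLC, /* core GL */
    SKETCH_ES1,
    SKETCH_ES2,
    SKETCH_API_COUNT
};

/* Assumed marker: no context version can reach it, so the extension is
 * never advertised for that API. */
#define SKETCH_UNSUPPORTED 0xff

struct sketch_extension {
    const char *name;
    uint8_t version[SKETCH_API_COUNT]; /* minimum context version per API */
};

/* A minimum version of 0 means "any version of this API". */
static bool
sketch_extension_supported(const struct sketch_extension *ext,
                           enum sketch_api api, uint8_t ctx_version,
                           bool driver_enabled)
{
    assert(ext != NULL && api < SKETCH_API_COUNT);

    return driver_enabled && ctx_version >= ext->version[api];
}
```

One comparison per lookup then covers both "is this API allowed at all" and "is the context new enough", which is why a separate API bit set becomes redundant in this scheme.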
Nanley Chery [Tue, 8 Sep 2015 19:41:18 +0000 (12:41 -0700)]
mesa/extensions: Use _mesa_extension_supported()

Replace open-coded checks for extension support with
_mesa_extension_supported().

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
Nanley Chery [Wed, 2 Sep 2015 18:53:16 +0000 (11:53 -0700)]
mesa/extensions: Create _mesa_extension_supported()

Create a function which determines if an extension is supported in the
current context.

v2: Use common variable names (Emil)
    Insert new line between variables and return statement (Chad)
    Rename api_set variable to api_bit (Chad)

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
Nanley Chery [Tue, 8 Sep 2015 19:25:56 +0000 (12:25 -0700)]
mesa/extensions: Add extension::version

Enable limiting advertised extension support by context version with
finer granularity. This new field is currently unused and is set to
0 everywhere. When it is used, a value of 0 will indicate that the
extension is supported for any version of a context.

v2: Use uint*t type for version and note the expected values (Emil)
    Use an 8-bit data type
    Reformat macro for better readability (Chad)

v3: Note preparatory nature of commit (Chad)

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
Nanley Chery [Wed, 16 Sep 2015 18:27:38 +0000 (11:27 -0700)]
mesa/extensions: Move entries to separate file

With this infrastructure set in place, we can now reuse the entries to
generate useful code.

v2: Add the new file into Makefile.sources (Emil)

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
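
Reusing one list of entries to generate several pieces of code is commonly done with the "X-macro" pattern, sketched below. Everything here is hypothetical: the extension names, the single minimum-version column and the `sketch_*` identifiers are placeholders, and the list is defined inline so the example is self-contained, whereas the commit keeps the entries in a separate file.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical entry list: EXT(name, minimum context version). */
#define SKETCH_EXT_LIST(EXT) \
    EXT(ARB_foo, 30)         \
    EXT(EXT_bar, 0)

/* Use 1: one boolean enable flag per entry. */
struct sketch_extensions {
#define EXT(name, min_version) bool name;
    SKETCH_EXT_LIST(EXT)
#undef EXT
};

/* Use 2: a table of names and minimum versions. */
struct sketch_ext_info {
    const char *name;
    int min_version;
};

static const struct sketch_ext_info sketch_table[] = {
#define EXT(name, min_version) { "GL_" #name, min_version },
    SKETCH_EXT_LIST(EXT)
#undef EXT
};

/* Use 3: one static inline helper per entry. */
#define EXT(name, min_version)                                    \
static inline bool                                                \
sketch_has_##name(const struct sketch_extensions *e, int version) \
{                                                                 \
    assert(e != NULL);                                            \
    return e->name && version >= (min_version);                   \
}
SKETCH_EXT_LIST(EXT)
#undef EXT
```

Adding an entry to the list updates the flag struct, the table and the helpers together, so the three cannot drift apart.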
Nanley Chery [Wed, 2 Sep 2015 18:26:57 +0000 (11:26 -0700)]
mesa/extensions: Wrap array entries in macros

Now that we're using macros, remove the redundant text from each entry.

Remove comments between the entries to make editing easier and separate
the sections with blank lines. Structure the EXT macros in a way that
helps reviewers verify that no meaning has been altered.

v2: Indent the entries (Chad)

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
Nanley Chery [Fri, 11 Sep 2015 16:59:32 +0000 (09:59 -0700)]
mesa/extensions: Remove array sentinel

Simplify future updates to the extension struct array by removing
the sentinel.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
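
One common way to drop a sentinel entry is to bound the loop by the array size instead, as in this sketch. The macro, the table contents and the lookup function are hypothetical and only illustrate the pattern.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* No terminating NULL entry: the array size bounds the loop instead. */
static const char *const sketch_names[] = {
    "GL_EXT_a",
    "GL_EXT_b",
    "GL_EXT_c",
};

/* Return the index of `name` in the table, or -1 if it is not present. */
static int sketch_find(const char *name)
{
    assert(name != NULL);

    for (size_t i = 0; i < SKETCH_ARRAY_SIZE(sketch_names); i++) {
        if (strcmp(sketch_names[i], name) == 0)
            return (int)i;
    }
    return -1;
}
```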
Matt Turner [Mon, 29 Jun 2015 22:59:37 +0000 (15:59 -0700)]
i965: Check instructions appear only on supported hardware.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Matt Turner [Mon, 29 Jun 2015 21:08:51 +0000 (14:08 -0700)]
i965: Add initial assembly validation pass.

Initially just checks that sources are non-NULL, which would have
alerted us to the problem fixed by commit 6c846dc5.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Matt Turner [Wed, 21 Oct 2015 22:23:10 +0000 (15:23 -0700)]
i965: Add annotation_insert_error() and support for printing errors.

Will allow annotations to contain error messages (indicating an
instruction violates a rule for instance) that are printed after the
disassembly of the block.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Matt Turner [Thu, 8 Oct 2015 04:04:48 +0000 (21:04 -0700)]
i965: Combine assembly annotations if possible.

Often annotations are identical between sets of consecutive
instructions. We can perhaps avoid some memory allocations by reusing
the previous annotation.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
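
A sketch of the merging idea: if a new annotation is identical to the previous one and directly follows it, extend the previous entry instead of storing a new one. The types, the fixed-size array and the pointer comparison used for "identical" are simplifications assumed for this sketch, not the i965 data structures.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_ANNOTATIONS 16

/* Hypothetical annotation covering the byte range [start, end). */
struct sketch_annotation {
    int start_offset;
    int end_offset;
    const char *note; /* stand-in for the annotation's contents */
};

struct sketch_list {
    struct sketch_annotation items[SKETCH_MAX_ANNOTATIONS];
    size_t count;
};

/* Annotate `size` bytes at `offset`; reuse the previous annotation when it
 * carries the same note and ends exactly where the new range begins. */
static void sketch_annotate(struct sketch_list *list, int offset, int size,
                            const char *note)
{
    assert(list != NULL);

    if (list->count > 0) {
        struct sketch_annotation *prev = &list->items[list->count - 1];

        if (prev->note == note && prev->end_offset == offset) {
            prev->end_offset = offset + size; /* extend, nothing new stored */
            return;
        }
    }

    assert(list->count < SKETCH_MAX_ANNOTATIONS);
    list->items[list->count].start_offset = offset;
    list->items[list->count].end_offset = offset + size;
    list->items[list->count].note = note;
    list->count++;
}
```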
Matt Turner [Mon, 29 Jun 2015 21:05:27 +0000 (14:05 -0700)]
i965: Set annotation_info's mem_ctx.

It was being memset to 0 previously.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Matt Turner [Thu, 15 Oct 2015 18:38:43 +0000 (11:38 -0700)]
i965: Don't consider control flow instructions to have sources.

And why did IFF have a destination?

I suspect that once upon a time the disassembler used this information
to know which fields to find the jump targets in. The jump targets have
moved, so the disassembler has to know how to handle these
per-generation anyway.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Matt Turner [Mon, 29 Jun 2015 21:03:55 +0000 (14:03 -0700)]
i965: Fill out instruction list.

Add some instructions: illegal, movi, sends, sendsc.

Remove some instructions with reused opcodes: msave, mrestore, push,
pop, goto. I did have some gross code for disassembling opcodes
per-generation, but there's very little meaningful overlap so it's
probably not needed.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Matt Turner [Mon, 29 Jun 2015 22:05:19 +0000 (15:05 -0700)]
ralloc: Set *start in ralloc_vasprintf_rewrite_tail() if str is NULL.

We were leaving it undefined, even though we were writing a string to
*str.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Matt Turner [Thu, 8 Oct 2015 21:19:10 +0000 (14:19 -0700)]
i965: Consolidate is_3src() functions.

Otherwise I'll have to add another later in this series.

Brian Paul [Tue, 10 Nov 2015 21:49:17 +0000 (14:49 -0700)]
st/wgl: add a comment about recursive locking in stw_make_current()

Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Brian Paul [Tue, 10 Nov 2015 21:49:01 +0000 (14:49 -0700)]
st/wgl: add a lock assertion in stw_framebuffer_from_hwnd_locked()

Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
José Fonseca [Tue, 10 Nov 2015 21:41:30 +0000 (14:41 -0700)]
st/wgl: add some mutex checking code

This would have caught the locking bug that was fixed in the earlier
"st/wgl: fix locking issue in stw_st_framebuffer_present_locked()"
patch.

v2: minor coding style changes by Brian.

Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Brian Paul [Tue, 10 Nov 2015 21:51:26 +0000 (14:51 -0700)]
st/wgl: rename stw_framebuffer_release() to stw_framebuffer_unlock()

To match the new stw_framebuffer_lock() function.

Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Brian Paul [Tue, 10 Nov 2015 21:38:25 +0000 (14:38 -0700)]
st/wgl: reimplement stw_framebuffer::mutex with CRITICAL_SECTION

v2: update comments on the stw_framebuffer::mutex field regarding locking
order.

Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Brian Paul [Tue, 10 Nov 2015 21:34:51 +0000 (14:34 -0700)]
st/wgl: include u_debug.h

To get the declaration of debug_printf() directly instead of getting it
indirectly through os_thread.h.

Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Brian Paul [Tue, 10 Nov 2015 21:24:18 +0000 (14:24 -0700)]
st/wgl: reimplement stw_device::fb_mutex with CRITICAL_SECTION

Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Brian Paul [Tue, 10 Nov 2015 21:10:45 +0000 (14:10 -0700)]
st/wgl: re-implement stw_device::ctx_mutex with CRITICAL_SECTION

This is Windows-only code so we can use the native Win32 functions for
critical sections.  This will also allow us to (cleanly) add some mutex
check/debug code in subsequent patches.

Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Brian Paul [Thu, 12 Nov 2015 16:06:25 +0000 (09:06 -0700)]
gallium/hud: add cpu graph support for Windows

We support "cpu" but not "cpu#" because there's no good way of querying
per-cpu usage.  Also, the cpu usage is for the process, not the whole
system.

Original code cobbled together by Brian and then fixed/polished by Jose.

Signed-off-by: Brian Paul <brianp@vmware.com>
Tapani Pälli [Mon, 2 Nov 2015 11:36:19 +0000 (13:36 +0200)]
glsl: set matrix_stride for non matrices with atomic counter buffers

This patch sets matrix_stride to 0 for non-matrix uniforms that are in an
atomic counter buffer. Matrix stride calculation for actual matrix
uniforms is done during link_assign_uniform_locations.

From ARB_program_interface_query specification:

GL_MATRIX_STRIDE:

   "For active variables not declared as a matrix or array of matrices,
   zero is written to <params>.  For active variables not backed by a
   buffer object, -1 is written to <params>, regardless of the variable
   type."

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>
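
The quoted rule can be written as a small decision function. This is a sketch with hypothetical names; `computed_stride` stands for whatever stride the linker has already calculated for a real matrix uniform.

```c
#include <assert.h>
#include <stdbool.h>

/* Value reported for GL_MATRIX_STRIDE under the quoted rule:
 *   not backed by a buffer object      -> -1, regardless of type
 *   buffer-backed but not a matrix     ->  0
 *   buffer-backed matrix (or array)    ->  the computed stride */
static int sketch_matrix_stride(bool is_matrix, bool buffer_backed,
                                int computed_stride)
{
    assert(computed_stride >= 0);

    if (!buffer_backed)
        return -1;
    if (!is_matrix)
        return 0;
    return computed_stride;
}
```

An atomic counter lives in a buffer and is not a matrix, so it falls into the middle case and reports 0.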
Tapani Pälli [Thu, 5 Nov 2015 10:52:26 +0000 (12:52 +0200)]
mesa: validate precision of varyings during ValidateProgramPipeline

Fixes the following failing ES3.1 CTS tests:

   ES31-CTS.sepshaderobjs.InterfacePrecisionMatchingFloat
   ES31-CTS.sepshaderobjs.InterfacePrecisionMatchingInt
   ES31-CTS.sepshaderobjs.InterfacePrecisionMatchingUInt

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Tapani Pälli [Thu, 5 Nov 2015 10:23:17 +0000 (12:23 +0200)]
glsl: do not lose precision information when packing varyings

This information will be used by cross stage validation of varyings
for pipeline objects.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Iago Toral Quiroga [Tue, 10 Nov 2015 06:22:07 +0000 (08:22 +0200)]
glsl: Add precision information to ir_variable

We will need this later on when we implement proper support for
precision qualifiers in the drivers and also to do link time checks for
uniforms as indicated by the spec.

This patch also adds compile-time checks for variables without precision
information (currently, Mesa only checks that a default precision is set
for floats in fragment shaders).

As indicated by Ian, the addition of the precision information to
ir_variable has been done using a bitfield and pahole to identify an
available hole so that memory requirements for ir_variable stay the
same.

v2 (Ian):
  - Avoid if-ladders by defining arrays of supported sampler names and
    indexing into them with type->sampler_array + 2 * type->sampler_shadow
  - Make the code that selects the precision qualifier a utility
    function
  - Fix a typo

v3 (Tapani):
  - rebased
  - squashed in "Precision qualifiers are not allowed on structs"
  - fixed select_gles_precision for sampler arrays
  - fixed precision_qualifier_allowed for arrays of structs

v4 (Tapani):
  - add atomic_uint handling
  - do not allow precision qualifier on images
  (issues reported by Marta)

v5 (Tapani):
  - support precision qualifier on image types

v6 (Tapani):
  - set precision qualifier on interface block members

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
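
The array-indexing trick from the v2 notes can be sketched as below for the 2D sampler family. The table and function are hypothetical; only the index expression (array flag plus twice the shadow flag) comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Index = is_array + 2 * is_shadow, so the order of entries matters. */
static const char *const sketch_2d_names[4] = {
    "sampler2D",            /* array = 0, shadow = 0 */
    "sampler2DArray",       /* array = 1, shadow = 0 */
    "sampler2DShadow",      /* array = 0, shadow = 1 */
    "sampler2DArrayShadow", /* array = 1, shadow = 1 */
};

/* Look up the type name without an if-ladder. */
static const char *sketch_sampler_2d_name(bool is_array, bool is_shadow)
{
    int index = (int)is_array + 2 * (int)is_shadow;

    assert(index >= 0 && index < 4);
    return sketch_2d_names[index];
}
```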
Iago Toral Quiroga [Thu, 5 Nov 2015 06:18:46 +0000 (08:18 +0200)]
glsl: Move the definition of precision_qualifier_allowed

We will need this to build later patches

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Iago Toral Quiroga [Thu, 26 Feb 2015 11:15:18 +0000 (12:15 +0100)]
glsl: Add user-defined default precision qualifiers to the symbol table

Notice that the spec requires that a default precision has been set for every
type used by a shader that can use a precision qualifier and does not have a
predefined precision. However, at the moment, Mesa only checks this for floats
in the fragment shader. This is probably because the GLSL ES 1.0 spec mentions
this case specifically, but GLSL ES 3.0 clarifies that the same applies to
other types:

"The fragment language has no default precision qualifier for floating point
 types. Hence for float, floating point vector and matrix variable
 declarations, either the declaration must include a precision qualifier or
 the default float precision must have been previously declared. Similarly,
 there is no default precision qualifier for the following sampler types in
 either the vertex or fragment language:

 sampler3D;
 samplerCubeShadow;
 sampler2DShadow;
 sampler2DArray;
 sampler2DArrayShadow;
 isampler2D;
 isampler3D;
 isamplerCube;
 isampler2DArray;
 usampler2D;
 usampler3D;
 usamplerCube;
 usampler2DArray;"

We will fix this in a later patch.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Iago Toral Quiroga [Thu, 26 Feb 2015 11:15:17 +0000 (12:15 +0100)]
glsl: Add default precision qualifiers to the symbol table

The GLSL ES spec specifies default precision qualifiers for certain types,
so populate the symbol table with these.

Notice that the desktop GLSL spec also indicates defaults for some types
but this is not really useful since precision qualifiers are completely
ignored in desktop GLSL.

v2: simplify and add samplerExternalOES, specified by
    OES_EGL_image_external (Tapani)

v3: add atomic_uint (reported missing by Marta)

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Iago Toral Quiroga [Thu, 26 Feb 2015 11:15:16 +0000 (12:15 +0100)]
glsl: Add API to put default precision qualifiers in the symbol table

These have scoping rules that match the ones defined for other things such
as variables, so we want them in the symbol table.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
9 years agoi965/fs/nir: fix the number of register written by FS_OPCODE_GET_BUFFER_SIZE
Samuel Iglesias Gonsálvez [Tue, 10 Nov 2015 12:45:21 +0000 (13:45 +0100)]
i965/fs/nir: fix the number of register written by FS_OPCODE_GET_BUFFER_SIZE

FS_OPCODE_GET_BUFFER_SIZE is calculated with a resinfo sampler message.

This patch adjusts the number of registers written by the opcode to match
what the PRM says about the size of the writeback message for SIMD8 and
SIMD16 sampler messages.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
9 years agoi965/skl/gt4: Fix URB programming restriction.
Ben Widawsky [Sat, 7 Nov 2015 02:12:27 +0000 (18:12 -0800)]
i965/skl/gt4: Fix URB programming restriction.

The comment in the code details the restriction. Thanks to Ken for having a very
helpful conversation with me, and spotting the blurb in the link I sent him :P.

There are still stability problems for me on GT4, but this definitely helps with
some of the failures.

v2: Comment fixes

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
9 years agonv50,nvc0: add ARB_clear_texture support
Ilia Mirkin [Mon, 9 Nov 2015 17:39:05 +0000 (12:39 -0500)]
nv50,nvc0: add ARB_clear_texture support

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
9 years agost/mesa: implement ARB_clear_texture
Ilia Mirkin [Wed, 5 Mar 2014 02:51:55 +0000 (21:51 -0500)]
st/mesa: implement ARB_clear_texture

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
9 years agogallium: add PIPE_CAP_CLEAR_TEXTURE and clear_texture prototype
Ilia Mirkin [Mon, 9 Nov 2015 18:27:07 +0000 (13:27 -0500)]
gallium: add PIPE_CAP_CLEAR_TEXTURE and clear_texture prototype

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
9 years agoglsl: add helper to check for enhanced layouts support
Timothy Arceri [Tue, 27 Oct 2015 20:42:49 +0000 (07:42 +1100)]
glsl: add helper to check for enhanced layouts support

Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
9 years agomesa: add ARB_enhanced_layouts
Timothy Arceri [Sun, 4 Oct 2015 13:01:45 +0000 (00:01 +1100)]
mesa: add ARB_enhanced_layouts

Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
9 years agor600: initialised PGM_RESOURCES_2 for ES/GS
Dave Airlie [Wed, 11 Nov 2015 22:34:18 +0000 (08:34 +1000)]
r600: initialised PGM_RESOURCES_2 for ES/GS

This fixes the corruption on rendering that we are seeing in
certain geometry shaders.

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=91780
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested / Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Cc: "10.6" "11.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
9 years agoi965: Split nir_emit_intrinsic by stage with a general fallback.
Kenneth Graunke [Thu, 5 Nov 2015 07:05:07 +0000 (23:05 -0800)]
i965: Split nir_emit_intrinsic by stage with a general fallback.

Many intrinsics only apply to a particular stage (such as discard).
In other cases, we may want to interpret them differently based on
the stage (such as load_primitive_id or load_input).

The current method isn't that pretty - we handle all intrinsics in
one giant function.  Sometimes we assert on stage, sometimes we forget.
Different behaviors are handled via if-ladders based on stage.

This commit introduces new nir_emit_<stage>_intrinsic() functions,
and makes nir_emit_instr() call those.  In turn, those fall back to
the generic nir_emit_intrinsic() function for cases they don't want
to handle specially.

This makes it clear which intrinsics only exist in one stage, and makes
it easy to handle inputs/outputs differently for various stages.
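
The dispatch structure described above can be sketched as follows. This is
only an illustration of the "per-stage handler with generic fallback" shape;
the enums, function names and return values are hypothetical and are not the
driver's actual nir_emit_*_intrinsic() code.

```c
#include <assert.h>

/* Hypothetical stand-ins for shader stages and NIR intrinsics. */
enum stage { STAGE_VERTEX, STAGE_FRAGMENT };
enum intrinsic { INTRINSIC_DISCARD, INTRINSIC_LOAD_INPUT, INTRINSIC_LOAD_UNIFORM };

/* Records which handler ended up dealing with the intrinsic. */
enum handler { HANDLED_BY_VS, HANDLED_BY_FS, HANDLED_BY_GENERIC, UNHANDLED };

/* Generic fallback: intrinsics that behave the same in every stage. */
static inline enum handler emit_generic_intrinsic(enum intrinsic op)
{
    switch (op) {
    case INTRINSIC_LOAD_UNIFORM:
        return HANDLED_BY_GENERIC;
    default:
        return UNHANDLED;
    }
}

/* Vertex stage: handles its own flavour of load_input, defers the rest. */
static inline enum handler emit_vs_intrinsic(enum intrinsic op)
{
    switch (op) {
    case INTRINSIC_LOAD_INPUT:
        return HANDLED_BY_VS;
    default:
        return emit_generic_intrinsic(op);
    }
}

/* Fragment stage: discard only exists here; load_input is stage specific. */
static inline enum handler emit_fs_intrinsic(enum intrinsic op)
{
    switch (op) {
    case INTRINSIC_DISCARD:
    case INTRINSIC_LOAD_INPUT:
        return HANDLED_BY_FS;
    default:
        return emit_generic_intrinsic(op);
    }
}

/* Entry point: pick the per-stage handler. */
static inline enum handler emit_intrinsic_for_stage(enum stage s, enum intrinsic op)
{
    switch (s) {
    case STAGE_VERTEX:
        return emit_vs_intrinsic(op);
    case STAGE_FRAGMENT:
        return emit_fs_intrinsic(op);
    }
    assert(!"unknown stage");
    return UNHANDLED;
}
```

In this sketch a discard reaching the vertex path falls through to the
generic handler and is reported as unhandled, which is where a real
implementation would presumably assert.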

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
9 years agomesa/copyimage: allow width/height to not be multiples of block
Ilia Mirkin [Sun, 8 Nov 2015 09:46:38 +0000 (04:46 -0500)]
mesa/copyimage: allow width/height to not be multiples of block

For compressed textures, the image size is not necessarily a multiple of
the block size (e.g. the last mip levels). Section 18.3.2 (Copying
Between Images) of the OpenGL 4.5 Core Profile spec says:

    An INVALID_VALUE error is generated if the dimensions of either
    subregion exceeds the boundaries of the corresponding image
    object, or if the image format is compressed and the dimensions of
    the subregion fail to meet the alignment constraints of the
    format.

and Section 8.7 (Compressed Texture Images) says:

    An INVALID_OPERATION error is generated if any of the following
    conditions occurs:

      * width is not a multiple of four, and width + xoffset is not
        equal to the value of TEXTURE_WIDTH.
      * height is not a multiple of four, and height + yoffset is not
        equal to the value of TEXTURE_HEIGHT.
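
As a rough illustration of the rule quoted above, a per-dimension check
might look like the sketch below. The function name and parameters are
hypothetical, it only covers the size condition (not offset alignment), and
it is not the code this patch adds.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: checks one dimension of a copy subregion in a
 * block-compressed image.
 *
 *   offset      start of the subregion along this dimension, in texels
 *   size        extent of the subregion along this dimension, in texels
 *   block       block size of the format along this dimension (e.g. 4)
 *   image_size  size of the image (mip level) along this dimension
 */
static inline bool region_dim_ok(unsigned offset, unsigned size,
                                 unsigned block, unsigned image_size)
{
    assert(block > 0);

    /* The subregion must stay inside the image. */
    if (offset + size > image_size)
        return false;

    /* Either a whole number of blocks, or the region reaches the image edge. */
    return size % block == 0 || offset + size == image_size;
}
```

With a 4-texel block and a 10-texel-wide mip level, a 2-texel region
starting at offset 8 is accepted because it ends at the image edge, while
the same 2-texel region starting at offset 0 is rejected.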

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92860
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: mesa-stable@lists.freedesktop.org
9 years agoi965/brw_reg: Add a brw_VxH_indirect helper
Jason Ekstrand [Thu, 20 Aug 2015 05:15:33 +0000 (22:15 -0700)]
i965/brw_reg: Add a brw_VxH_indirect helper

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
9 years agomesa: remove old comments in arrayobj.c
Brian Paul [Wed, 11 Nov 2015 00:03:37 +0000 (17:03 -0700)]
mesa: remove old comments in arrayobj.c

9 years agost/wgl: clarify code in stw_framebuffer_from_hwnd_locked()
Brian Paul [Tue, 10 Nov 2015 00:25:22 +0000 (17:25 -0700)]
st/wgl: clarify code in stw_framebuffer_from_hwnd_locked()

Just a minor code change to make it obvious that NULL is returned when
we don't find the given HWND.

Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
9 years agost/wgl: improve some function comments
Brian Paul [Tue, 10 Nov 2015 00:35:55 +0000 (17:35 -0700)]
st/wgl: improve some function comments

In particular, explain when stw_framebuffer objects are
locked/unlocked/etc.

Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
9 years agost/wgl: whitespace/formatting fixes
Brian Paul [Tue, 10 Nov 2015 00:19:35 +0000 (17:19 -0700)]
st/wgl: whitespace/formatting fixes

9 years agost/wgl: fix locking issue in stw_st_framebuffer_present_locked()
Brian Paul [Mon, 9 Nov 2015 21:51:56 +0000 (14:51 -0700)]
st/wgl: fix locking issue in stw_st_framebuffer_present_locked()

When stw_st_framebuffer_present_locked() is called, the
stw_framebuffer's mutex will already be locked.  Normally, the
stw_framebuffer_present_locked() function calls
stw_framebuffer_release() to unlock the mutex when it's done.  But if
for some reason the 'resource' pointer in
stw_st_framebuffer_present_locked() is null, we'd return without
unlocking the stw_framebuffer.  This fixes that to avoid potential
deadlocks.
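
The pattern being fixed can be sketched as below: a function that is entered
with the lock held must release it on every return path, including the early
return taken when the resource pointer is null. The types and functions are
hypothetical stand-ins (a boolean flag instead of a real mutex), not the
st/wgl code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for stw_framebuffer; 'locked' models its mutex. */
struct fake_framebuffer {
    bool locked;
};

static inline void fb_lock(struct fake_framebuffer *fb)
{
    assert(!fb->locked);
    fb->locked = true;
}

static inline void fb_release(struct fake_framebuffer *fb)
{
    assert(fb->locked);
    fb->locked = false;
}

/*
 * Precondition: fb is locked by the caller.
 * Postcondition: fb is unlocked on every path.
 * Returns true if something was presented (a convention of this sketch).
 */
static inline bool present_locked(struct fake_framebuffer *fb, const void *resource)
{
    assert(fb->locked);

    if (resource == NULL) {
        /* Early return: the lock must still be released here. */
        fb_release(fb);
        return false;
    }

    /* ... present the resource ... */

    fb_release(fb);
    return true;
}
```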

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
9 years agoi965: Print force_writemask_all in dump_instructions().
Kenneth Graunke [Tue, 10 Nov 2015 07:55:58 +0000 (23:55 -0800)]
i965: Print force_writemask_all in dump_instructions().

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
9 years agoi965: Combine BRW_NEW_*_BINDING_TABLE dirty bits.
Kenneth Graunke [Tue, 25 Nov 2014 10:59:28 +0000 (02:59 -0800)]
i965: Combine BRW_NEW_*_BINDING_TABLE dirty bits.

A while back, we moved to directly emitting the Gen7+ state when
constructing the binding tables.  These flags are only used on
Gen4-6, which emit all the binding table pointers at once.

We gain nothing by having separate flags, so combine them.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
9 years agoi965: Map GL_PATCHES to 3DPRIM_PATCHLIST_n.
Kenneth Graunke [Sat, 25 Jul 2015 04:15:35 +0000 (21:15 -0700)]
i965: Map GL_PATCHES to 3DPRIM_PATCHLIST_n.

Inspired by a patch by Fabian Bieler.

Fabian defined a _3DPRIM_PATCHLIST_0 macro (which isn't actually a valid
topology type); I instead chose to make a macro that takes an argument.
He also took the number of patch vertices from _mesa_prim (which was set
to ctx->TessCtrlProgram.patch_vertices) - I chose to use it directly to
avoid the need for the VBO patch.

v2: Change macro to 0x20 + (n - 1) instead of 0x1F + n to better match
    the documentation (suggested by Ian).
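
For illustration, the v2 formula can be written as a small helper. The
function name is hypothetical and the 1..32 range check is an assumption of
this sketch; the patch itself uses a macro that takes the vertex count as
its argument.

```c
#include <assert.h>

/*
 * Hypothetical helper: maps a patch vertex count n to a topology value
 * using the 0x20 + (n - 1) form mentioned in the v2 note. The valid range
 * 1..32 is an assumption made for this sketch.
 */
static inline unsigned patchlist_topology(unsigned n)
{
    assert(n >= 1 && n <= 32);
    return 0x20 + (n - 1);
}
```

Written this way, a one-vertex patch maps to 0x20 and each additional
vertex adds one; 0x1F + n yields the same values but hides the base.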

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
9 years agodocs: add news item and link release notes for 11.0.5
Emil Velikov [Wed, 11 Nov 2015 11:18:27 +0000 (11:18 +0000)]
docs: add news item and link release notes for 11.0.5

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
9 years agodocs: add sha256 checksums for 11.0.5
Emil Velikov [Wed, 11 Nov 2015 11:10:30 +0000 (11:10 +0000)]
docs: add sha256 checksums for 11.0.5

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 66c949d0a19b1e601243be22b6506528b866388b)

9 years agodocs: add release notes for 11.0.5
Emil Velikov [Wed, 11 Nov 2015 10:05:57 +0000 (10:05 +0000)]
docs: add release notes for 11.0.5

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit ee57c22141c42d9b511a7dfa5971c4428cd1c6e7)

9 years agor600g: Pass conservative depth parameters to hw
Glenn Kennard [Sat, 17 Oct 2015 14:53:28 +0000 (16:53 +0200)]
r600g: Pass conservative depth parameters to hw

Supported on R700 and up.

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
9 years agoRevert "r600g: Pass conservative depth parameters to hw"
Dave Airlie [Tue, 10 Nov 2015 23:05:50 +0000 (09:05 +1000)]
Revert "r600g: Pass conservative depth parameters to hw"

This reverts commit a1fc78911e9a6439db94d6ae91d5672c76e5fb1c.

I pushed the wrong patch.

9 years agor600g: Implement ARB_texture_view
Glenn Kennard [Thu, 15 Oct 2015 23:53:47 +0000 (01:53 +0200)]
r600g: Implement ARB_texture_view

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
9 years agor600g: Pass conservative depth parameters to hw
Glenn Kennard [Fri, 16 Oct 2015 22:52:39 +0000 (00:52 +0200)]
r600g: Pass conservative depth parameters to hw

Supported on R700 and up.

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
9 years agoi965/nir/opt_peephole_ffma: Bypass fusion if any operand of fadd and fmul is a const
Eduardo Lima Mitev [Thu, 22 Oct 2015 13:32:13 +0000 (15:32 +0200)]
i965/nir/opt_peephole_ffma: Bypass fusion if any operand of fadd and fmul is a const

When both fadd and fmul instructions have at least one operand that is a
constant and it is only used once, the total number of instructions can
be reduced from 3 (1 ffma + 2 load_const) to 2 (1 fmul + 1 fadd), because
the constants will be propagated as immediate operands of fmul and fadd.

This patch detects these situations and prevents fusing fmul+fadd into ffma.
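
The decision described above can be sketched as a simple predicate. The
struct and function names are hypothetical; the real pass inspects NIR
sources and their uses rather than a summary struct like this one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-operand summary. */
struct operand_info {
    bool is_const;       /* operand comes from a load_const */
    unsigned use_count;  /* number of uses of that value */
};

/* True if at least one of the two operands is a constant used exactly once. */
static inline bool has_single_use_const(const struct operand_info ops[2])
{
    assert(ops != NULL);
    for (int i = 0; i < 2; i++) {
        if (ops[i].is_const && ops[i].use_count == 1)
            return true;
    }
    return false;
}

/*
 * Fuse fmul+fadd into ffma unless both instructions have a single-use
 * constant operand. In that case the unfused pair costs 2 instructions
 * (constants become immediates), while ffma would cost 3
 * (1 ffma + 2 load_const).
 */
static inline bool should_fuse_ffma(const struct operand_info fmul_ops[2],
                                    const struct operand_info fadd_ops[2])
{
    return !(has_single_use_const(fmul_ops) && has_single_use_const(fadd_ops));
}
```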

Shader-db results on i965 Haswell:

total instructions in shared programs: 6235835 -> 6225895 (-0.16%)
instructions in affected programs:     1124094 -> 1114154 (-0.88%)
total loops in shared programs:        1979 -> 1979 (0.00%)
helped:                                7612
HURT:                                  843
GAINED:                                4
LOST:                                  0

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
9 years agoutil: Add list_is_singular() helper function
Eduardo Lima Mitev [Fri, 23 Oct 2015 14:31:41 +0000 (16:31 +0200)]
util: Add list_is_singular() helper function

Returns whether the list has exactly one element.
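
A minimal sketch of such a check on a circular doubly-linked list with a
sentinel head is shown below. The struct and helper names are hypothetical
and chosen to avoid implying they match the util list API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical list node; the head is a sentinel that points to itself
 * when the list is empty. */
struct list_node {
    struct list_node *prev;
    struct list_node *next;
};

static inline void list_node_init(struct list_node *head)
{
    head->prev = head;
    head->next = head;
}

static inline void list_node_add_tail(struct list_node *item, struct list_node *head)
{
    assert(item != NULL && head != NULL);
    item->prev = head->prev;
    item->next = head;
    head->prev->next = item;
    head->prev = item;
}

/* Exactly one element: the list is not empty, and the first element's
 * successor is the head again. */
static inline bool list_has_one_element(const struct list_node *head)
{
    return head->next != head && head->next->next == head;
}
```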

Reviewed-by: Matt Turner <mattst88@gmail.com>
9 years agonir/nir_opt_peephole_ffma: Move this lowering pass to the i965 driver
Eduardo Lima Mitev [Thu, 22 Oct 2015 13:25:23 +0000 (15:25 +0200)]
nir/nir_opt_peephole_ffma: Move this lowering pass to the i965 driver

Because the next patch will add an optimization that is specific to i965,
we want to move this lowering pass to that driver altogether.

This is safe because i965 is the only consumer.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
9 years agoglsl: Use array deref for access to vector components
Kristian Høgsberg Kristensen [Wed, 4 Nov 2015 22:58:54 +0000 (14:58 -0800)]
glsl: Use array deref for access to vector components

We've assumed that we could lower per-component vector access from

  vec[i] = scalar

to

  vec = ir_triop_vector_insert(vec, scalar, i)

but with SSBOs (and compute shader SLM and tessellation outputs) this is
no longer valid. If a vector is "externally visible", multiple threads
can write independent components simultaneously. With lowering to
ir_triop_vector_insert, each thread reads the entire vector, changes one
component, then writes out the entire vector. This is racy.
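
The lost update can be shown with a single-threaded simulation of one
unlucky interleaving. Everything here is a hypothetical model written for
illustration (a plain struct in place of a shared vector, sequential code in
place of GPU threads), not compiler or driver code.

```c
#include <assert.h>

struct vec4 {
    float c[4];
};

/* Models ir_triop_vector_insert: returns a copy of v with component i
 * replaced by scalar. */
static inline struct vec4 vector_insert(struct vec4 v, float scalar, unsigned i)
{
    assert(i < 4);
    v.c[i] = scalar;
    return v;
}

/* Two "threads" both read the shared vector before either writes it back.
 * Each write-back stores the whole vector, so the second one overwrites
 * the first thread's component with the stale value it read earlier. */
static inline struct vec4 racy_update(struct vec4 shared)
{
    struct vec4 t0 = shared;              /* thread 0 reads */
    struct vec4 t1 = shared;              /* thread 1 reads */
    shared = vector_insert(t0, 1.0f, 0);  /* thread 0 writes whole vector */
    shared = vector_insert(t1, 2.0f, 1);  /* thread 1 writes whole vector */
    return shared;
}

/* With per-component stores, each thread touches only its own component. */
static inline struct vec4 per_component_update(struct vec4 shared)
{
    shared.c[0] = 1.0f;  /* thread 0 */
    shared.c[1] = 2.0f;  /* thread 1 */
    return shared;
}
```

In the racy version, component 0 ends up with its original value because
thread 1 wrote back the copy it had read before thread 0's store.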

Instead of generating an ir_binop_vector_extract when we see v[i], we
generate ir_dereference_array. We then add a separate lowering pass that
lowers the ir_dereference_array to ir_binop_vector_extract for rvalues and
to ir_triop_vector_insert for lvalues.

The resulting IR is the same as before, but we now have a window between
ast->ir conversion and the lowering pass where v[i] appears in the IR as
an array deref. This lets us run lowering passes that lower the vector
access to I/O (e.g. for SSBO load/store) before we lower the per-component
access to full vector writes.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>