mesa.git
9 years agovc4: convert from tgsi semantic/index to varying-slot
Eric Anholt [Wed, 9 Sep 2015 17:23:55 +0000 (13:23 -0400)]
vc4: convert from tgsi semantic/index to varying-slot

(originally part of previous patch, split out to separate patch by Rob)

v2: squash in some fixes from Eric
v3: Another fix from Eric for point coords.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
9 years agogallium/ttn: Convert to using VARYING_SLOT_* / FRAG_RESULT_*.
Eric Anholt [Tue, 4 Aug 2015 21:28:02 +0000 (14:28 -0700)]
gallium/ttn: Convert to using VARYING_SLOT_* / FRAG_RESULT_*.

This avoids exceeding the size of the .index bitfield since it got
truncated, and should make our NIR look more like the NIR that the rest of
the NIR developers are working on.

v2: split out vc4 updates, first patch uses varying_slot_to_tgsi_semantic()
    helper, and second patch does the actual conversion.
v3: add frag_result_to_tgsi_semantic() helper and don't try to map
    frag_results to semantic name/index as if they were varying_slot's
v4: use VERT_ATTRIB_ for VS inputs
v5: Fix vc4 build.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
9 years agonv50, nvc0: fix max texture buffer size to 128M elements
Ilia Mirkin [Tue, 15 Sep 2015 23:39:25 +0000 (19:39 -0400)]
nv50, nvc0: fix max texture buffer size to 128M elements

This is what the hardware supports; there never was any sort of 64K
limit.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
9 years agost/mesa: avoid integer overflows with buffers >= 512MB
Ilia Mirkin [Tue, 15 Sep 2015 23:32:10 +0000 (19:32 -0400)]
st/mesa: avoid integer overflows with buffers >= 512MB

This fixes failures with the newly-submitted max-size texture buffer
piglit test for GPUs exposing >= 128M max texels.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
9 years agomesa: move GL_APPLE_object_purgeable functions to new file
Brian Paul [Tue, 15 Sep 2015 20:28:38 +0000 (14:28 -0600)]
mesa: move GL_APPLE_object_purgeable functions to new file

Move this code out of bufferobj.c since it's not strongly connected to
buffer objects.

Acked-by: Matt Turner <mattst88@gmail.com>
9 years agomesa: remove trailing whitespace in bufferobj.c
Brian Paul [Tue, 15 Sep 2015 20:04:58 +0000 (14:04 -0600)]
mesa: remove trailing whitespace in bufferobj.c

Trivial.

9 years agomesa: whitespace, line wrap fixes in varray.c
Brian Paul [Tue, 15 Sep 2015 20:03:04 +0000 (14:03 -0600)]
mesa: whitespace, line wrap fixes in varray.c

Trivial.

9 years agonir/print: print symbolic names from shader-enum
Rob Clark [Tue, 15 Sep 2015 22:55:48 +0000 (18:55 -0400)]
nir/print: print symbolic names from shader-enum

v2: split out moving of FILE *fp into state structure into its own
(more complete patch) to reduce the noise in this one

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
9 years agonir/print: bit of state refactoring
Rob Clark [Tue, 15 Sep 2015 22:50:41 +0000 (18:50 -0400)]
nir/print: bit of state refactoring

Rename print_var_state to print_state, and stuff FILE ptr into the state
object.  This avoids passing around an extra parameter everywhere.

v2: even more extensive conversion.. use state *everywhere* instead of
FILE ptr, and convert nir_print_instr() to use state as well
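
A minimal sketch of the pattern described above, not the actual nir_print.c
code: the struct layout and the helper names (print_indent, print_line) are
assumptions made for illustration. The point is only that the FILE pointer
lives in the state object, so helpers take one parameter instead of two.

```c
#include <stdio.h>

/* Hypothetical stand-in for the printer's state object. */
typedef struct {
    FILE *fp;        /* output stream, stored once in the state */
    unsigned indent; /* example of other per-print state */
} print_state;

/* Helpers only take the state; no separate FILE * parameter is needed. */
static void print_indent(print_state *state)
{
    for (unsigned i = 0; i < state->indent; i++)
        fprintf(state->fp, "\t");
}

/* Prints one indented line and returns the fprintf() result for the text. */
static int print_line(print_state *state, const char *text)
{
    print_indent(state);
    return fprintf(state->fp, "%s\n", text);
}
```

A caller would fill in the state once, e.g. { stdout, 1 }, and pass its
address to every helper.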

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
9 years agoglsl: shader-enum to name debug fxns
Rob Clark [Fri, 11 Sep 2015 16:48:05 +0000 (12:48 -0400)]
glsl: shader-enum to name debug fxns

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
9 years agofreedreno: one screen to rule them all
Rob Clark [Fri, 4 Sep 2015 15:35:33 +0000 (11:35 -0400)]
freedreno: one screen to rule them all

Similar to fee0686c21c631d96d6042741267a3c218c23ffc, but in this case to
ensure that drm_gralloc and libGLES_mesa are sharing a single screen.

Bumps libdrm_freedreno version dependency, as it requires the new
fd_device_fd() API.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
9 years agofreedreno/ir3: use NIR to lower ffract instead of tgsi_lowering
Rob Clark [Mon, 14 Sep 2015 15:54:05 +0000 (11:54 -0400)]
freedreno/ir3: use NIR to lower ffract instead of tgsi_lowering

Signed-off-by: Rob Clark <robclark@freedesktop.org>
9 years agonir: add lowering for ffract
Rob Clark [Mon, 14 Sep 2015 15:13:19 +0000 (11:13 -0400)]
nir: add lowering for ffract

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
9 years agoi965/fs: The barrier send uses only 1 payload register
Jordan Justen [Tue, 15 Sep 2015 21:01:17 +0000 (14:01 -0700)]
i965/fs: The barrier send uses only 1 payload register

When preparing the barrier payload, the instructions should operate in
simd8 mode since we only use 1 payload register.

fs_inst::regs_read is also updated to indicate that it only reads one
register for SHADER_OPCODE_BARRIER.

These issues were flagged by:

commit cadd7dd384b33a779d46bd664f456bed4a21a5b7
Author: Jason Ekstrand <jason.ekstrand@intel.com>
Date:   Thu Jul 2 15:41:02 2015 -0700

    i965/fs: Add a very basic validation pass

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
9 years agonir/builder: Use a normal temporary array in nir_channel
Jason Ekstrand [Tue, 15 Sep 2015 19:09:06 +0000 (12:09 -0700)]
nir/builder: Use a normal temporary array in nir_channel

C++ gets cranky if we take references of temporaries.  This isn't a problem
yet in master because nir_builder is never used from C++.  However, it will
be in the future so we should fix it now.
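
A rough illustration of the two styles, assuming the original code passed a
C99 compound literal as the swizzle array. pick_channel() is a hypothetical
stand-in for the real swizzle helper, not a NIR function.

```c
/* Hypothetical helper standing in for a function that takes a swizzle array. */
static unsigned pick_channel(const unsigned swiz[4], unsigned n)
{
    return swiz[n];
}

/* C99 compound literal: fine in C, but the array is an unnamed temporary,
 * which is the kind of construct C++ compilers tend to complain about. */
static unsigned channel_compound_literal(unsigned c)
{
    return pick_channel((unsigned[4]){c, c, c, c}, 0);
}

/* Named temporary array: the same result, acceptable to both languages. */
static unsigned channel_named_array(unsigned c)
{
    unsigned swiz[4] = {c, c, c, c};
    return pick_channel(swiz, 0);
}
```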

Reviewed-by: Rob Clark <robclark@freedesktop.org>
9 years agofreedreno/a4xx: more texture formats
Rob Clark [Mon, 14 Sep 2015 19:15:06 +0000 (15:15 -0400)]
freedreno/a4xx: more texture formats

Signed-off-by: Rob Clark <robclark@freedesktop.org>
9 years agofreedreno/a4xx: border-color support
Rob Clark [Tue, 15 Sep 2015 21:25:47 +0000 (17:25 -0400)]
freedreno/a4xx: border-color support

Signed-off-by: Rob Clark <robclark@freedesktop.org>
9 years agofreedreno/a4xx: wire up texture clamp lowering
Rob Clark [Tue, 15 Sep 2015 21:25:25 +0000 (17:25 -0400)]
freedreno/a4xx: wire up texture clamp lowering

Signed-off-by: Rob Clark <robclark@freedesktop.org>
9 years agofreedreno: helper for a3xx/a4xx border-colors
Rob Clark [Tue, 15 Sep 2015 13:23:21 +0000 (09:23 -0400)]
freedreno: helper for a3xx/a4xx border-colors

Both use the same layout for the buffer containing border-color values,
so rather than duplicating the logic in a4xx, split it out into a
helper.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
9 years agofreedreno: update generated headers
Rob Clark [Mon, 14 Sep 2015 20:59:36 +0000 (16:59 -0400)]
freedreno: update generated headers

Signed-off-by: Rob Clark <robclark@freedesktop.org>
9 years agonir/lower_vec_to_movs: Coalesce into destinations of fdot instructions
Jason Ekstrand [Thu, 10 Sep 2015 00:18:55 +0000 (17:18 -0700)]
nir/lower_vec_to_movs: Coalesce into destinations of fdot instructions

Now that we have a replicating fdot instruction, we can actually coalesce
into the destinations of vec4 instructions.  We couldn't really do this
before because, if the destination had to end up in .z, we couldn't
reswizzle the instruction.  With a replicated destination, the result ends
up in all channels so we can just set the writemask and we're done.
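
As a sketch of why replication helps (plain C rather than NIR; the vec4 type
and function names are assumptions for illustration): since every channel
would receive the same scalar, the writemask alone decides where it lands and
no reswizzle is needed.

```c
typedef struct { float c[4]; } vec4;

static float dot4(vec4 a, vec4 b)
{
    float sum = 0.0f;
    for (unsigned i = 0; i < 4; i++)
        sum += a.c[i] * b.c[i];
    return sum;
}

/* Replicated dot product: the scalar result is written to every channel
 * enabled in the writemask; other channels of dst are left untouched. */
static void fdot_replicated(vec4 *dst, vec4 a, vec4 b, unsigned writemask)
{
    float d = dot4(a, b);
    for (unsigned i = 0; i < 4; i++) {
        if (writemask & (1u << i))
            dst->c[i] = d;
    }
}
```

With a writemask of 0x4 the result ends up in .z directly.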

Shader-db results for vec4 programs on Haswell:

   total instructions in shared programs: 1747753 -> 1746280 (-0.08%)
   instructions in affected programs:     143274 -> 141801 (-1.03%)
   helped:                                667
   HURT:                                  0

It turns out that dot-products matter...

Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
9 years agoi965/vec4: Use the replicated fdot instruction in NIR
Jason Ekstrand [Thu, 10 Sep 2015 18:08:15 +0000 (11:08 -0700)]
i965/vec4: Use the replicated fdot instruction in NIR

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
9 years agonir: Add a fdot instruction that replicates the result to a vec4
Jason Ekstrand [Thu, 10 Sep 2015 17:51:46 +0000 (10:51 -0700)]
nir: Add a fdot instruction that replicates the result to a vec4

Fortunately, nir_constant_expr already auto-splats if "dst" never shows up
in the constant expression field so we don't need to do anything there.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
9 years agonir/lower_vec_to_movs: Coalesce movs on-the-fly when possible
Jason Ekstrand [Wed, 9 Sep 2015 21:40:06 +0000 (14:40 -0700)]
nir/lower_vec_to_movs: Coalesce movs on-the-fly when possible

The old pass blindly inserted a bunch of moves into the shader with no
concern for whether or not they were really needed.  This adds code to try
and coalesce into the destination of the instruction providing the value.

Shader-db results for vec4 shaders on Haswell:

   total instructions in shared programs: 1754420 -> 1747753 (-0.38%)
   instructions in affected programs:     231230 -> 224563 (-2.88%)
   helped:                                1017
   HURT:                                  2

This approach is heavily based on a different patch by Eduardo Lima Mitev
<elima@igalia.com>.  Eduardo's patch did this in a separate pass as opposed
to integrating it into nir_lower_vec_to_movs.

Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
9 years agonir/lower_vec_to_movs: Get rid of start_idx and swizzle compacting
Jason Ekstrand [Wed, 9 Sep 2015 21:47:28 +0000 (14:47 -0700)]
nir/lower_vec_to_movs: Get rid of start_idx and swizzle compacting

Previously, we did this thing with keeping track of a separate start_idx
which was different from the iteration variable.  I think this was a relic
of the way that GLSL IR implements writemasks.  In NIR, if a given bit in
the writemask is unset then that channel is just "unused", not missing.  In
particular, a vec4 operation with a writemask of 0xd will use sources 0, 2,
and 3 and leave source 1 alone.  We can simplify things a good deal (and
make them correct) by removing this "compacting" step.
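
The writemask semantics described above can be sketched in plain C (this is an
illustration, not the pass itself; the function name and float channels are
assumptions):

```c
/* Copies src[i] to dst[i] for every channel i enabled in the writemask.
 * A cleared bit means the channel is unused: source i is skipped, and the
 * remaining sources are NOT compacted down. Returns the number of moves. */
static unsigned vec_to_movs(const float src[4], float dst[4],
                            unsigned writemask)
{
    unsigned moves = 0;
    for (unsigned i = 0; i < 4; i++) {
        if (!(writemask & (1u << i)))
            continue;
        dst[i] = src[i];
        moves++;
    }
    return moves;
}
```

With a writemask of 0xd (binary 1101) this uses sources 0, 2 and 3 and never
looks at source 1.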

Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
9 years agoi965/vec4_nir: Use partial SSA form rather than full non-SSA
Jason Ekstrand [Wed, 9 Sep 2015 20:55:39 +0000 (13:55 -0700)]
i965/vec4_nir: Use partial SSA form rather than full non-SSA

We made this switch in the FS backend some time ago and it seems to make a
number of things a bit easier.  In particular, supporting SSA values takes
very little work in the backend and allows us to take advantage of the
majority of the SSA information even after we've gotten rid of Phi nodes.

Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
9 years agonir/lower_vec_to_movs: Handle partially SSA shaders
Jason Ekstrand [Wed, 9 Sep 2015 20:42:14 +0000 (13:42 -0700)]
nir/lower_vec_to_movs: Handle partially SSA shaders

v2 (Jason Ekstrand):
 - Use nir_instr_rewrite_dest
 - Pass the impl directly into lower_vec_to_movs_block

Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
9 years agonir/lower_vec_to_movs: Pass the shader around directly
Jason Ekstrand [Wed, 9 Sep 2015 19:58:58 +0000 (12:58 -0700)]
nir/lower_vec_to_movs: Pass the shader around directly

Previously, we were passing the shader around but calling it
"mem_ctx".  However, the nir_shader is (and must be for the purposes of
mark-and-sweep) the mem_ctx so we might as well pass it around explicitly.

Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
9 years agoi965/fs: Add a very basic validation pass
Jason Ekstrand [Thu, 2 Jul 2015 22:41:02 +0000 (15:41 -0700)]
i965/fs: Add a very basic validation pass

Currently the validation pass only validates that regs_read and
regs_written are consistent with the sizes of VGRF's.  We can add more as
we find it to be useful.

Reviewed-by: Matt Turner <mattst88@gmail.com>
9 years agoi965/fs_surface_builder: Only apply predicate to components that exist
Jason Ekstrand [Mon, 14 Sep 2015 22:36:24 +0000 (15:36 -0700)]
i965/fs_surface_builder: Only apply predicate to components that exist

In certain conditions, we have to do bounds-checking in the shader for
image_load_store.  The way this works for image loads is that we do a
predicated load and then emit a series of selects, one per component,
that gives us 0 or the loaded value depending on whether or not you're
in bounds.  However, we were hard-coding 4 components which may not be
correct.  Instead, we should be using size which is the number of
components read.
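
Roughly, in scalar C terms (a sketch only; the function and parameter names
are assumptions, and the real code emits per-component select instructions):

```c
/* After a predicated load, each component that was actually read becomes
 * either the loaded value (in bounds) or 0 (out of bounds). Only "size"
 * components are touched, rather than a hard-coded 4. */
static void select_loaded(unsigned dst[4], const unsigned loaded[4],
                          unsigned size, int in_bounds)
{
    for (unsigned c = 0; c < size; c++)
        dst[c] = in_bounds ? loaded[c] : 0u;
}
```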

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
9 years agoi965/fs: Only read output_components many components when writing an output
Jason Ekstrand [Mon, 14 Sep 2015 21:18:13 +0000 (14:18 -0700)]
i965/fs: Only read output_components many components when writing an output

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
9 years agoi965/fs: Set output_components for lowered clip distance outputs
Jason Ekstrand [Mon, 14 Sep 2015 22:09:00 +0000 (15:09 -0700)]
i965/fs: Set output_components for lowered clip distance outputs

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
9 years agomesa/teximage: restrict GL_ETC1_RGB8_OES support to GLES
Nanley Chery [Thu, 27 Aug 2015 23:05:22 +0000 (16:05 -0700)]
mesa/teximage: restrict GL_ETC1_RGB8_OES support to GLES

According to the extensions table and our glext headers,
OES_compressed_ETC1_RGB8_texture is only supported in
GLES1 and GLES2. Since we may give users a GLES3 context
when a GLES2 context is requested, we also allow this
extension for GLES3 as well.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
9 years agomesa/extensions: restrict GL_OES_EGL_image to GLES
Nanley Chery [Thu, 10 Sep 2015 17:48:46 +0000 (10:48 -0700)]
mesa/extensions: restrict GL_OES_EGL_image to GLES

Driver vendors do this as well. The extension specification
lists GLES 1.1 or 2.0 as requirements.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
9 years agomesa/extensions: restrict luminance alpha formats to API_OPENGL_COMPAT
Nanley Chery [Thu, 27 Aug 2015 23:05:22 +0000 (16:05 -0700)]
mesa/extensions: restrict luminance alpha formats to API_OPENGL_COMPAT

According to the GL 3.1 spec, luminance alpha formats are deprecated.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
9 years agogallium/svga: Enable PIPE_FORMAT_L8_UNORM for vgpu10
Thomas Hellstrom [Tue, 15 Sep 2015 06:40:07 +0000 (23:40 -0700)]
gallium/svga: Enable PIPE_FORMAT_L8_UNORM for vgpu10

It's extensively used by XA for a8- and planar yuv component surfaces.
This fixes broken XA yuv blits using vgpu10 contexts.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
9 years agoegl/dri2: don't leak the fd on dri2_terminate
Emil Velikov [Thu, 10 Sep 2015 13:41:38 +0000 (14:41 +0100)]
egl/dri2: don't leak the fd on dri2_terminate

The existing check was incorrect, as it did not consider the (unlikely)
case of fd == 0. To fix this we should first correctly
initialize it to -1, as the swrast implementations leave it set to zero
(props to calloc()).
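
A small sketch of the idea, assuming a display record allocated with calloc().
The struct and function names are hypothetical, not the egl/dri2 ones.

```c
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical display record; only the fd field matters here. */
struct ex_display {
    int fd;
};

static struct ex_display *ex_display_create(void)
{
    struct ex_display *dpy = calloc(1, sizeof(*dpy));
    if (dpy)
        dpy->fd = -1; /* calloc() left it at 0, which is a valid descriptor */
    return dpy;
}

/* Returns 1 if a descriptor was closed, 0 if there was none to close. */
static int ex_display_terminate(struct ex_display *dpy)
{
    int closed = 0;
    if (dpy->fd >= 0) { /* a plain truthiness test would skip fd == 0 */
        close(dpy->fd);
        closed = 1;
    }
    free(dpy);
    return closed;
}
```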

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Boyan Ding <boyan.j.ding@gmail.com>
9 years agoegl/dri2/drm: compact existing device mgmt
Emil Velikov [Mon, 7 Sep 2015 08:53:53 +0000 (09:53 +0100)]
egl/dri2/drm: compact existing device mgmt

Move the fcntl(dupfd_cloexec) to the else branch where it belongs.
Otherwise it's not immediately obvious that the code is hit only when
an existing device is used.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Boyan Ding <boyan.j.ding@gmail.com>
9 years agoegl/dri2: Close file descriptor on error.
Matt Turner [Wed, 15 Jul 2015 16:00:41 +0000 (09:00 -0700)]
egl/dri2: Close file descriptor on error.

v2: [Emil Velikov]
Rework the error path to a common goto, close only if we own the fd.
v3: [Emil Velikov]
Always close the fd (we either opened the device or dup'd) (Boyan, Ian)

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Boyan Ding <boyan.j.ding@gmail.com>
9 years agogbm: convert gbm bo format to fourcc format on dma-buf import
Ray Strode [Fri, 28 Aug 2015 18:50:21 +0000 (14:50 -0400)]
gbm: convert gbm bo format to fourcc format on dma-buf import

At the moment, if a gbm buffer is imported and the gbm buffer
has an old-style GBM_BO_FORMAT format, the import will crash,
since it's passed directly to DRI functions that expect
a fourcc format (as provided by the newer GBM_FORMAT
definitions).

This commit addresses the problem in two ways:

1) it prevents invalid formats from leading to a crash by
returning EINVAL if the image couldn't be created

2) it translates GBM_BO_FORMAT formats into the comparable
GBM_FORMAT formats.
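
The two steps might look something like the following. This is a
self-contained sketch with EX_-prefixed stand-ins for the gbm definitions
(the names and the ex_import() helper are assumptions, not the gbm API); it
reports failure by setting errno to EINVAL and returning -1.

```c
#include <errno.h>
#include <stdint.h>

/* Stand-ins for the old-style enum values and the fourcc-based formats. */
#define EX_FOURCC(a, b, c, d) \
    ((uint32_t)(a) | ((uint32_t)(b) << 8) | \
     ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

enum { EX_BO_FORMAT_XRGB8888 = 0, EX_BO_FORMAT_ARGB8888 = 1 };
#define EX_FORMAT_XRGB8888 EX_FOURCC('X', 'R', '2', '4')
#define EX_FORMAT_ARGB8888 EX_FOURCC('A', 'R', '2', '4')

/* Step 2: translate old-style formats; anything else passes through. */
static uint32_t ex_format_to_fourcc(uint32_t format)
{
    switch (format) {
    case EX_BO_FORMAT_XRGB8888: return EX_FORMAT_XRGB8888;
    case EX_BO_FORMAT_ARGB8888: return EX_FORMAT_ARGB8888;
    default:                    return format;
    }
}

/* Step 1: fail cleanly instead of crashing on a format we cannot import.
 * Returns 0 and stores the fourcc on success, -1 with errno = EINVAL
 * otherwise. */
static int ex_import(uint32_t format, uint32_t *fourcc_out)
{
    uint32_t fourcc = ex_format_to_fourcc(format);
    if (fourcc != EX_FORMAT_XRGB8888 && fourcc != EX_FORMAT_ARGB8888) {
        errno = EINVAL;
        return -1;
    }
    *fourcc_out = fourcc;
    return 0;
}
```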

Reference: https://bugzilla.gnome.org/show_bug.cgi?id=753531
CC: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
9 years agodocs: document INTEL_DEBUG 'optimizer' envvar
Alejandro Piñeiro [Mon, 14 Sep 2015 18:16:25 +0000 (20:16 +0200)]
docs: document INTEL_DEBUG 'optimizer' envvar

Reviewed-by: Matt Turner <mattst88@gmail.com>
9 years agoi965: Move perf_debug code to brw_codegen_*_prog()
Kristian Høgsberg Kristensen [Sat, 5 Sep 2015 00:09:40 +0000 (17:09 -0700)]
i965: Move perf_debug code to brw_codegen_*_prog()

We're trying to avoid a libdrm dependency in the core compiler, so let's
move the perf_debug code one level up from the brw_*_emit() helpers to
the brw_codegen_*_prog() helpers.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
9 years agoi965: Move brw_fs_precompile() to brw_wm.c
Kristian Høgsberg Kristensen [Fri, 4 Sep 2015 23:55:03 +0000 (16:55 -0700)]
i965: Move brw_fs_precompile() to brw_wm.c

All other precompile functions live in the brw_<stage>.c files; make fs
follow the convention.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
9 years agoi965: Move compute shader code around
Kristian Høgsberg Kristensen [Fri, 4 Sep 2015 23:35:34 +0000 (16:35 -0700)]
i965: Move compute shader code around

This moves the compute shader code around in order to make the way the
code is split up more consistent. There should be no functional changes.
Typically we have a few files per stage:

    brw_vs.c, brw_wm.c, brw_gs.c:

        code to drive code generation and implement precompiling and
        cache search.

    genX_<stage>_state.c

        gen specific implementation of the state emission for the shader
        stage.

The brw_*_emit() functions are all in the same files as the visitor
classes they use (with the exception of VS, which may use either vec4 or
fs).

To make compute follow this convention, we move the brw_cs_emit()
function into brw_fs.cpp. We can then rename brw_cs.cpp to brw_cs.c and
do this in C like the other similar files.  Finally, move state setup
and atoms to gen7_cs_state.c.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
9 years agometa: Abort meta pbo path if TexSubImage need signed unsigned conversion
Anuj Phogat [Fri, 24 Jul 2015 22:53:58 +0000 (15:53 -0700)]
meta: Abort meta pbo path if TexSubImage need signed unsigned conversion

See similar fix for Readpixels in mesa commit 0d20790. Jason suggested
we need that for TexSubImage as well.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
9 years agonvc0/ir: start offset at texBindBase for txq, like regular texturing
Ilia Mirkin [Fri, 11 Sep 2015 03:58:17 +0000 (23:58 -0400)]
nvc0/ir: start offset at texBindBase for txq, like regular texturing

Curiously this has no actual effect. I think it's because the first 8
textures are bound in multiple slots for some reason. However, it seems
prudent to use these the same way as regular texturing, especially in the
case where there are more than 8 textures bound.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
9 years agovc4: Fix build from recent NIR cleanups.
Eric Anholt [Mon, 14 Sep 2015 15:21:07 +0000 (11:21 -0400)]
vc4: Fix build from recent NIR cleanups.

9 years agoi965/vec4_nir: Load constants as integers
Antia Puentes [Mon, 14 Sep 2015 07:50:59 +0000 (09:50 +0200)]
i965/vec4_nir: Load constants as integers

Load constants using integer as their register type, as is
done in the FS backend.

No shader-db changes in HSW.

Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91716
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
9 years agoi965/vec4: Fix saturation errors when coalescing registers
Antia Puentes [Wed, 5 Aug 2015 13:57:33 +0000 (15:57 +0200)]
i965/vec4: Fix saturation errors when coalescing registers

If the register types do not match and the instruction
that contains the final destination is saturated, register
coalescing generated non-equivalent code.

This did not happen when using IR because types usually
matched, but it is visible in nir-vec4.

For example,
   mov      vgrf7:D vgrf2:D
   mov.sat  m4:F vgrf7:F

is coalesced to:
   mov.sat  m4:D vgrf2:D

The patch prevents coalescing in such a scenario, unless the
instruction we want to coalesce into is a MOV (without type
conversion implied). In that case, the patch sets the register
types to the type of the final destination.
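
To see why the two sequences above are not equivalent, here is a scalar model
in C. It assumes that a float saturate clamps to [0, 1] and that a saturating
D-to-D move leaves the 32-bit value unchanged; the function names are made up
for the illustration.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret the 32 bits of an integer register as a float. */
static float bits_as_float(int32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof(f));
    return f;
}

/* Float saturate: clamp to [0, 1]. */
static float saturate(float f)
{
    return f < 0.0f ? 0.0f : (f > 1.0f ? 1.0f : f);
}

/* mov vgrf7:D vgrf2:D ; mov.sat m4:F vgrf7:F */
static float original_sequence(int32_t vgrf2)
{
    int32_t vgrf7 = vgrf2;
    return saturate(bits_as_float(vgrf7));
}

/* mov.sat m4:D vgrf2:D, under the assumption stated above: the bits are
 * copied as an integer, so no clamp to [0, 1] ever happens. */
static int32_t coalesced_sequence(int32_t vgrf2)
{
    return vgrf2;
}
```

For the bit pattern 0x40000000 (2.0 as a float), the original sequence yields
1.0 while the coalesced one still holds the bits of 2.0.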

Shader-db results in HSW (only vec4 instructions shown):

total instructions in shared programs: 1754415 -> 1754416 (0.00%)
instructions in affected programs:     74 -> 75 (1.35%)
helped:                                0
HURT:                                  1
GAINED:                                0
LOST:                                  0

Only one extra instruction in one of the shaders, which comes from
eliminating a saturation error by preventing register coalescing.

Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
9 years agodocs: cleanups + mark some work as done
Tapani Pälli [Mon, 14 Sep 2015 05:50:51 +0000 (08:50 +0300)]
docs: cleanups + mark some work as done

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
9 years agodocs: only astc ldr required for ES3.2, not hdr
Ilia Mirkin [Mon, 14 Sep 2015 05:07:05 +0000 (01:07 -0400)]
docs: only astc ldr required for ES3.2, not hdr

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
9 years agost/mesa: emit TXQS, support ARB_shader_texture_image_samples
Ilia Mirkin [Fri, 11 Sep 2015 01:44:45 +0000 (21:44 -0400)]
st/mesa: emit TXQS, support ARB_shader_texture_image_samples

The image component of the ext is a no-op since there is no image support
in gallium (yet).

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
9 years agor600g: add support for TXQS tgsi opcode
Ilia Mirkin [Fri, 11 Sep 2015 02:33:34 +0000 (22:33 -0400)]
r600g: add support for TXQS tgsi opcode

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
9 years agonv50/ir: add support for TXQS tgsi opcode
Ilia Mirkin [Fri, 11 Sep 2015 02:07:27 +0000 (22:07 -0400)]
nv50/ir: add support for TXQS tgsi opcode

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
9 years agogallium: add PIPE_CAP_TGSI_TXQS to let st know if TXQS is supported
Ilia Mirkin [Fri, 11 Sep 2015 21:29:49 +0000 (17:29 -0400)]
gallium: add PIPE_CAP_TGSI_TXQS to let st know if TXQS is supported

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
9 years agotgsi: add a TXQS opcode to retrieve the number of texture samples
Ilia Mirkin [Fri, 11 Sep 2015 01:37:23 +0000 (21:37 -0400)]
tgsi: add a TXQS opcode to retrieve the number of texture samples

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
9 years agoglsl/cs: Initialize gl_LocalInvocationIndex in main()
Jordan Justen [Mon, 17 Aug 2015 23:32:42 +0000 (16:32 -0700)]
glsl/cs: Initialize gl_LocalInvocationIndex in main()

We initialize gl_LocalInvocationIndex based on the extension spec
formula:

    gl_LocalInvocationIndex =
        gl_LocalInvocationID.z * gl_WorkGroupSize.x * gl_WorkGroupSize.y +
        gl_LocalInvocationID.y * gl_WorkGroupSize.x +
        gl_LocalInvocationID.x;

https://www.opengl.org/registry/specs/ARB/compute_shader.txt
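
The formula above, written out as plain C for reference (the uvec3 type and
function name are just for illustration; the real change builds GLSL IR):

```c
typedef struct { unsigned x, y, z; } uvec3;

/* Flattens the 3D local invocation ID into a linear index, x-major. */
static unsigned local_invocation_index(uvec3 local_id, uvec3 group_size)
{
    return local_id.z * group_size.x * group_size.y +
           local_id.y * group_size.x +
           local_id.x;
}
```

For example, a local ID of (1, 2, 3) in a 4x5x6 work group gives
3*4*5 + 2*4 + 1 = 69.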

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
9 years agoglsl/cs: Exclude gl_LocalInvocationIndex from builtin variable stripping
Jordan Justen [Mon, 17 Aug 2015 22:49:44 +0000 (15:49 -0700)]
glsl/cs: Exclude gl_LocalInvocationIndex from builtin variable stripping

We lower gl_LocalInvocationIndex based on the extension spec formula:

    gl_LocalInvocationIndex =
        gl_LocalInvocationID.z * gl_WorkGroupSize.x * gl_WorkGroupSize.y +
        gl_LocalInvocationID.y * gl_WorkGroupSize.x +
        gl_LocalInvocationID.x;

https://www.opengl.org/registry/specs/ARB/compute_shader.txt

We need to set this variable in main(), even if gl_LocalInvocationIndex
is not referenced by the shader. (It may be used by a linked shader.)
Therefore, we can't eliminate it as a dead variable.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
9 years agoglsl/cs: Initialize gl_GlobalInvocationID in main()
Jordan Justen [Mon, 17 Aug 2015 21:35:44 +0000 (14:35 -0700)]
glsl/cs: Initialize gl_GlobalInvocationID in main()

We initialize gl_GlobalInvocationID based on the extension spec
formula:

    gl_GlobalInvocationID =
        gl_WorkGroupID * gl_WorkGroupSize + gl_LocalInvocationID

https://www.opengl.org/registry/specs/ARB/compute_shader.txt
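
The same formula as a component-wise C sketch (the uvec3 type and function
name are illustrative only):

```c
typedef struct { unsigned x, y, z; } uvec3;

/* Component-wise: global = work_group_id * work_group_size + local_id. */
static uvec3 global_invocation_id(uvec3 work_group_id, uvec3 work_group_size,
                                  uvec3 local_id)
{
    uvec3 g;
    g.x = work_group_id.x * work_group_size.x + local_id.x;
    g.y = work_group_id.y * work_group_size.y + local_id.y;
    g.z = work_group_id.z * work_group_size.z + local_id.z;
    return g;
}
```

For example, work group (2, 0, 1) of size (8, 4, 2) with local ID (3, 1, 0)
gives a global ID of (19, 1, 2).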

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
9 years agoglsl: Move link_get_main_function_signature to a common location
Jordan Justen [Mon, 17 Aug 2015 19:22:34 +0000 (12:22 -0700)]
glsl: Move link_get_main_function_signature to a common location

Also rename to _mesa_get_main_function_signature.

We will call it near the end of compilation to insert some code into
main for initializing some compute shader global variables.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
9 years agoglsl/cs: Don't strip gl_GlobalInvocationID and dependencies
Jordan Justen [Mon, 17 Aug 2015 19:30:25 +0000 (12:30 -0700)]
glsl/cs: Don't strip gl_GlobalInvocationID and dependencies

We lower gl_GlobalInvocationID based on the extension spec formula:

    gl_GlobalInvocationID =
        gl_WorkGroupID * gl_WorkGroupSize + gl_LocalInvocationID

https://www.opengl.org/registry/specs/ARB/compute_shader.txt

We need to set this variable in main(), even if gl_GlobalInvocationID
is not referenced by the shader. (It may be used by a linked shader.)
Therefore, we can't eliminate these as dead variables.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
9 years agoi965/nir: Support gl_WorkGroupID variable
Jordan Justen [Fri, 13 Mar 2015 18:39:53 +0000 (11:39 -0700)]
i965/nir: Support gl_WorkGroupID variable

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
9 years agoi965/cs: Initialize gl_WorkGroupID variable from payload
Jordan Justen [Fri, 10 Oct 2014 15:28:24 +0000 (08:28 -0700)]
i965/cs: Initialize gl_WorkGroupID variable from payload

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
9 years agonir: Add gl_WorkGroupID system variable
Jordan Justen [Fri, 13 Mar 2015 18:37:03 +0000 (11:37 -0700)]
nir: Add gl_WorkGroupID system variable

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
9 years agoglsl/cs: Add gl_WorkGroupID variable
Jordan Justen [Fri, 10 Oct 2014 15:28:24 +0000 (08:28 -0700)]
glsl/cs: Add gl_WorkGroupID variable

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
9 years agoi965/nir: Support gl_LocalInvocationID variable
Jordan Justen [Fri, 13 Mar 2015 18:34:48 +0000 (11:34 -0700)]
i965/nir: Support gl_LocalInvocationID variable

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
9 years agoi965/cs: Initialize gl_LocalInvocationID from payload
Jordan Justen [Sat, 22 Nov 2014 03:14:41 +0000 (19:14 -0800)]
i965/cs: Initialize gl_LocalInvocationID from payload

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
9 years agoi965/cs: Initialize gl_LocalInvocationID in push constant data
Jordan Justen [Fri, 10 Oct 2014 15:33:23 +0000 (08:33 -0700)]
i965/cs: Initialize gl_LocalInvocationID in push constant data

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
9 years agoi965/cs: Reserve local invocation id in payload regs
Jordan Justen [Sat, 22 Nov 2014 02:47:49 +0000 (18:47 -0800)]
i965/cs: Reserve local invocation id in payload regs

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
9 years agonir: Add gl_LocalInvocationID variable
Jordan Justen [Fri, 13 Mar 2015 18:32:43 +0000 (11:32 -0700)]
nir: Add gl_LocalInvocationID variable

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
9 years agoglsl/cs: Add gl_LocalInvocationID variable
Jordan Justen [Fri, 10 Oct 2014 15:28:24 +0000 (08:28 -0700)]
glsl/cs: Add gl_LocalInvocationID variable

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
9 years agosoftpipe: Change faces type to uint
Krzesimir Nowak [Sat, 12 Sep 2015 14:17:00 +0000 (08:17 -0600)]
softpipe: Change faces type to uint

This is to avoid needless float<->int conversions, since all
face-related computations are made on integers. Spotted by Emil
Velikov.

Reviewed-by: Brian Paul <brianp@vmware.com>
9 years agofreedreno/ir3: fix compile warn after 1807a08e
Rob Clark [Sun, 13 Sep 2015 15:22:51 +0000 (11:22 -0400)]
freedreno/ir3: fix compile warn after 1807a08e

New enum to add to switch so compiler doesn't complain.

   commit 1807a08e4f35b014f2a80d1e88dd74a9f096d7a5
   Author:     Ilia Mirkin <imirkin@alum.mit.edu>
   AuthorDate: Thu Aug 27 23:05:03 2015 -0400
   Commit:     Ilia Mirkin <imirkin@alum.mit.edu>
   CommitDate: Thu Sep 10 17:38:33 2015 -0400

       nir: add nir_texop_texture_samples and convert from glsl

Signed-off-by: Rob Clark <robclark@freedesktop.org>
9 years ago freedreno/ir3: fix compile break after a4aa25be
Rob Clark [Sun, 13 Sep 2015 15:21:28 +0000 (11:21 -0400)]
freedreno/ir3: fix compile break after a4aa25be

The following commit dropped the unused mem_ctx argument:

   commit a4aa25be1e0a27b1a6a6b0bcf576beb9dfe1ea7a
   Author:     Jason Ekstrand <jason.ekstrand@intel.com>
   AuthorDate: Wed Sep 9 13:24:35 2015 -0700
   Commit:     Jason Ekstrand <jason.ekstrand@intel.com>
   CommitDate: Fri Sep 11 09:21:20 2015 -0700

       nir: Remove the mem_ctx parameter from ssa_def_rewrite_uses

Signed-off-by: Rob Clark <robclark@freedesktop.org>
9 years ago nir: add nir_channel() to get at single components of vec's
Rob Clark [Thu, 10 Sep 2015 20:06:05 +0000 (16:06 -0400)]
nir: add nir_channel() to get at single components of vec's

Rather than make yet another copy of channel(), let's move it into nir.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
9 years ago tgsi/scan: add support to figure out max nesting depth
Rob Clark [Wed, 9 Sep 2015 22:28:55 +0000 (18:28 -0400)]
tgsi/scan: add support to figure out max nesting depth

This is sometimes a useful thing for compilers (or, for example,
tgsi_to_nir) to know, and it is pretty trivial for scan to figure out for us.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
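
The max-nesting-depth scan described in the tgsi/scan commit above can be
illustrated with a minimal sketch. This is not the tgsi_scan code: the
toy_opcode enum and toy_max_nesting_depth() are hypothetical names invented
for the example, and it assumes that only IF and loop constructs open a
nesting level.

```c
#include <stddef.h>

/* Hypothetical opcode set for illustration; not the real TGSI opcodes. */
enum toy_opcode {
   TOY_OP_OTHER,
   TOY_OP_IF,
   TOY_OP_ENDIF,
   TOY_OP_BGNLOOP,
   TOY_OP_ENDLOOP,
};

/* Walk the instruction stream once, tracking current and maximum depth. */
static unsigned
toy_max_nesting_depth(const enum toy_opcode *ops, size_t count)
{
   unsigned depth = 0;
   unsigned max_depth = 0;

   for (size_t i = 0; i < count; i++) {
      switch (ops[i]) {
      case TOY_OP_IF:
      case TOY_OP_BGNLOOP:
         /* Entering a block: one level deeper. */
         depth++;
         if (depth > max_depth)
            max_depth = depth;
         break;
      case TOY_OP_ENDIF:
      case TOY_OP_ENDLOOP:
         /* Leaving a block; tolerate unbalanced input. */
         if (depth > 0)
            depth--;
         break;
      default:
         break;
      }
   }

   return max_depth;
}
```

A consumer such as a translator could use the result to size a control-flow
stack up front instead of growing it while walking the shader.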
9 years ago r600: Fix llvm build since const buffer changes
Kai Wasserbäch [Sat, 12 Sep 2015 08:39:50 +0000 (10:39 +0200)]
r600: Fix llvm build since const buffer changes

Commit f9caabe8f1bff86d19b53d9ecba5c72b238d9e23 missed one place in
r600_llvm.c when replacing R600_UCP_CONST_BUFFER with
R600_BUFFER_INFO_CONST_BUFFER.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91985
Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Signed-off-by: Dave Airlie <airlied@gmail.com>
9 years ago i965/vec4: Don't reswizzle hardware registers
Jason Ekstrand [Thu, 10 Sep 2015 23:19:42 +0000 (16:19 -0700)]
i965/vec4: Don't reswizzle hardware registers

Cc: "11.0 10.6" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91719
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
9 years ago i965/emit: Add assertions for accumulator restrictions
Jason Ekstrand [Thu, 10 Sep 2015 23:19:22 +0000 (16:19 -0700)]
i965/emit: Add assertions for accumulator restrictions

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
9 years ago docs: add news item and link release notes for 11.0.0
Emil Velikov [Sat, 12 Sep 2015 12:50:33 +0000 (13:50 +0100)]
docs: add news item and link release notes for 11.0.0

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
9 years ago docs: add sha256 checksums for 11.0.0
Emil Velikov [Sat, 12 Sep 2015 12:32:56 +0000 (13:32 +0100)]
docs: add sha256 checksums for 11.0.0

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit c4bae5792bb5515da42e23f166f5ba5d68f79615)

9 years ago docs: Update 11.0.0 release notes
Emil Velikov [Sat, 12 Sep 2015 09:33:49 +0000 (10:33 +0100)]
docs: Update 11.0.0 release notes

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 4f1e500150be2e82a2d7eb954f7198cc0c5cbec1)

9 years ago r600: Enable fp64 on chips with native support
Glenn Kennard [Fri, 11 Sep 2015 10:42:23 +0000 (12:42 +0200)]
r600: Enable fp64 on chips with native support

Enabled for Cypress/Cayman/Aruba; earlier r6xx/r7xx chips only support a
subset of the needed fp64 ops, and don't do GL4 anyway.

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
9 years ago r600g: Support I2D/U2D/D2I/D2U
Glenn Kennard [Fri, 11 Sep 2015 10:42:22 +0000 (12:42 +0200)]
r600g: Support I2D/U2D/D2I/D2U

Only for Cypress/Cayman/Aruba; older chips have only partial fp64 support.
Uses float intermediate values, so it is only accurate for the int24 range,
which matches what the blob does.

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
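
The int24 accuracy limit mentioned in the I2D/U2D/D2I/D2U commit above follows
from the 24-bit significand of a 32-bit float. The sketch below shows that
effect on the CPU only; toy_i2d_via_float() is a hypothetical helper and does
not reproduce the r600g shader code.

```c
#include <stdint.h>

/*
 * Convert an integer to double through a 32-bit float intermediate.
 * A float has a 24-bit significand, so integers with magnitude above
 * 2^24 may be rounded before they reach the double.
 */
static double
toy_i2d_via_float(int32_t value)
{
   float intermediate = (float)value;

   return (double)intermediate;
}
```

Under the usual IEEE 754 round-to-nearest-even mode, 16777216 (2^24)
survives the round trip exactly, while 16777217 is rounded to 16777216 on the
way through the float.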
9 years ago r600g: lower number of driver const buffers
Dave Airlie [Fri, 11 Sep 2015 03:43:53 +0000 (04:43 +0100)]
r600g: lower number of driver const buffers

I'm going to want a driver constant buffer for tess to coordinate
LDS storage, so before tackling that I decided to merge the
clip/samplepos and texture info buffers into one, which lets me steal
the spare one.

This creates a single constant buffer between the two, with
clip/samplepos taking up a reserved 128 bytes at the start.

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
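
The merged buffer layout described in the const-buffer commit above (a
reserved 128 bytes of clip/samplepos data at the start, followed by texture
info) can be sketched as a simple offset computation. The macro and function
names are hypothetical, and the 32-byte entry size is an assumption made for
the example; the commit message does not state the real entry size.

```c
#include <stddef.h>

/* Bytes reserved for clip/samplepos data at the start of the buffer,
 * as stated in the commit message. */
#define TOY_CLIP_SAMPLEPOS_RESERVED_BYTES 128u

/* Assumed size of one texture-info entry; not taken from the driver. */
#define TOY_TEXINFO_ENTRY_BYTES 32u

/* Byte offset of the index-th texture-info entry in the merged buffer. */
static size_t
toy_texinfo_offset(unsigned index)
{
   return TOY_CLIP_SAMPLEPOS_RESERVED_BYTES +
          (size_t)index * TOY_TEXINFO_ENTRY_BYTES;
}
```

Keeping the reserved region at a fixed offset means the clip/samplepos
consumers do not need to know how many texture-info entries follow.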
9 years ago r600: define some values for the fetch constant offsets.
Dave Airlie [Fri, 11 Sep 2015 02:11:43 +0000 (03:11 +0100)]
r600: define some values for the fetch constant offsets.

This just puts these in one place and #defines them.

Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
9 years ago docs: Update with GLES3.2 entries and status
Thomas Helland [Wed, 12 Aug 2015 13:07:57 +0000 (15:07 +0200)]
docs: Update with GLES3.2 entries and status

V2: -Change to "not started" for most entries
    -Add status for multisample_2d_array
    -Change shader_multisample_interpolation to "not started"

V3 (idr): Move the GLES 3.2 section after the "Additional functions"
section from GLES 3.1.  Note that GL_KHR_texture_compression_astc_hdr is
done for i965 on gen9+ hardware.  Note that GL_OES_shader_io_blocks is
based on some features from GLSL 1.50.

Signed-off-by: Thomas Helland <thomashelland90@gmail.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com> [v2]
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
9 years ago softpipe: Constify variables
Krzesimir Nowak [Fri, 11 Sep 2015 18:07:42 +0000 (20:07 +0200)]
softpipe: Constify variables

This commit makes a lot of variables constant - this is basically done
by moving the computation to variable definition. Some of them are
moved into lower scopes (like in img_filter_2d_ewa).

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
9 years ago softpipe: Constify sp_tgsi_sampler
Krzesimir Nowak [Fri, 11 Sep 2015 18:07:41 +0000 (20:07 +0200)]
softpipe: Constify sp_tgsi_sampler

Add a small inline function doing the casting - this is to make sure
we don't do a cast from some completely unrelated type. This commit
does not make tgsi_sampler parameters const in vfuncs themselves for
now - probably llvmpipe would need looking at before making such a
change.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
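
The "small inline function doing the casting" from the sp_tgsi_sampler commit
above can be illustrated as follows. The struct and function names here are
hypothetical stand-ins, not the softpipe definitions; the sketch assumes the
usual C subclassing pattern where the base struct is the first member.

```c
#include <stddef.h>

/* Hypothetical base "class", standing in for tgsi_sampler. */
struct toy_sampler {
   int dummy;
};

/* Hypothetical subclass, standing in for sp_tgsi_sampler.
 * The base must be the first member for the cast below to be valid. */
struct toy_sp_sampler {
   struct toy_sampler base;
   unsigned num_views;
};

/*
 * Downcast helper. Because the parameter is typed as a pointer to the
 * base struct, passing a pointer to some unrelated type is diagnosed by
 * the compiler, unlike a raw cast written at each call site.
 */
static inline const struct toy_sp_sampler *
toy_sp_sampler_const(const struct toy_sampler *sampler)
{
   return (const struct toy_sp_sampler *)sampler;
}
```

Call sites then write toy_sp_sampler_const(sampler) rather than an explicit
cast, so the one unchecked conversion lives in a single place.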
9 years ago softpipe: Constify sampler and view parameters in mip filters
Krzesimir Nowak [Fri, 11 Sep 2015 18:07:40 +0000 (20:07 +0200)]
softpipe: Constify sampler and view parameters in mip filters

Those functions actually could always take them as constants.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
9 years ago softpipe: Constify sampler and view parameters in img filters
Krzesimir Nowak [Fri, 11 Sep 2015 18:07:39 +0000 (20:07 +0200)]
softpipe: Constify sampler and view parameters in img filters

Those functions actually could always take them as constants.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
9 years ago tgsi, softpipe: Constify tgsi_sampler in query_lod vfunc
Krzesimir Nowak [Fri, 11 Sep 2015 18:07:38 +0000 (20:07 +0200)]
tgsi, softpipe: Constify tgsi_sampler in query_lod vfunc

A followup to the previous commit: since all functions called by
query_lod take pointers to const sp_sampler_view and const sp_sampler,
which are taken from the tgsi_sampler subclass, we can now take the
tgsi_sampler as const itself.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
9 years ago softpipe: Constify some sampler and view parameters
Krzesimir Nowak [Fri, 11 Sep 2015 18:07:37 +0000 (20:07 +0200)]
softpipe: Constify some sampler and view parameters

This is to prepare for making tgsi_sampler parameter in query_lod a
const too. These functions do not modify anything in either sampler or
view anymore.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
9 years ago softpipe: Move the faces array from view to filter_args
Krzesimir Nowak [Fri, 11 Sep 2015 18:07:36 +0000 (20:07 +0200)]
softpipe: Move the faces array from view to filter_args

With that, sp_sampler_view instances are no longer abused as local
storage, so we can later make them constant.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
9 years ago nir/from_ssa: Use instr_rewrite_dest
Jason Ekstrand [Wed, 9 Sep 2015 23:03:10 +0000 (16:03 -0700)]
nir/from_ssa: Use instr_rewrite_dest

Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
9 years ago nir: Add a function for rewriting instruction destinations
Jason Ekstrand [Wed, 9 Sep 2015 22:58:25 +0000 (15:58 -0700)]
nir: Add a function for rewriting instruction destinations

Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
9 years ago nir: Only unlink sources that are actually valid
Jason Ekstrand [Wed, 9 Sep 2015 22:58:08 +0000 (15:58 -0700)]
nir: Only unlink sources that are actually valid

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
9 years ago nir: Remove the mem_ctx parameter from ssa_def_rewrite_uses
Jason Ekstrand [Wed, 9 Sep 2015 20:24:35 +0000 (13:24 -0700)]
nir: Remove the mem_ctx parameter from ssa_def_rewrite_uses

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
9 years ago nir: Fix a bunch of ralloc parenting errors
Jason Ekstrand [Wed, 9 Sep 2015 20:18:29 +0000 (13:18 -0700)]
nir: Fix a bunch of ralloc parenting errors

As of a10d4937, we would really like things associated with an instruction
to be allocated out of that instruction and not out of the shader.  In
particular, you should be passing the instruction that will ultimately be
holding the source into nir_src_copy rather than an arbitrary memory
context.

We also change the prototypes of nir_dest_copy and nir_alu_src/dest_copy to
explicitly take an instruction so we catch this earlier in the future.

Cc: "11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
9 years ago nir/lower_outputs_to_temporaries: Reparent the output name
Jason Ekstrand [Thu, 10 Sep 2015 20:56:08 +0000 (13:56 -0700)]
nir/lower_outputs_to_temporaries: Reparent the output name

We copy the output, make the old output the temporary, and give the
temporary a new name.  The copy keeps the pointer to the old name.  This
works just fine up until the point where we lower things to SSA and delete
the old variable and, with it, the name.  Instead, we should re-parent to
the copy.

Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>