mesa.git
8 years ago radeonsi: add HUD queries for counting VS/PS/CS partial flushes
Marek Olšák [Tue, 23 Aug 2016 13:17:35 +0000 (15:17 +0200)]
radeonsi: add HUD queries for counting VS/PS/CS partial flushes

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years ago gallium/radeon: rename the num-cs-flushes query to num-ctx-flushes
Marek Olšák [Tue, 23 Aug 2016 13:07:35 +0000 (15:07 +0200)]
gallium/radeon: rename the num-cs-flushes query to num-ctx-flushes

num-cs-flushes will mean compute shader flushes

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years ago radeonsi: fix a badly implemented GS bug workaround
Marek Olšák [Tue, 23 Aug 2016 15:58:22 +0000 (17:58 +0200)]
radeonsi: fix a badly implemented GS bug workaround

Limit it to geometry shaders and Hawaii.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years ago radeonsi: fix texture format reinterpretation with DCC
Marek Olšák [Mon, 22 Aug 2016 11:45:05 +0000 (13:45 +0200)]
radeonsi: fix texture format reinterpretation with DCC

DCC is limited in how texture formats can be reinterpreted using texture
views. If we get a view format that is incompatible with the initial
texture format with respect to DCC, disable DCC.

There is a new piglit which tests all format combinations.
What works and what doesn't was deduced by looking at the piglit failures.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years ago radeonsi: fix Gather4 with integer formats
Marek Olšák [Sat, 20 Aug 2016 14:50:01 +0000 (16:50 +0200)]
radeonsi: fix Gather4 with integer formats

The closed compiler does the same thing.

This fixes: GL45-CTS.texture_gather.*-int-* (18 tests)

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years ago radeonsi: fix a crash in imageSize for cubemap arrays
Marek Olšák [Sat, 20 Aug 2016 23:32:22 +0000 (01:32 +0200)]
radeonsi: fix a crash in imageSize for cubemap arrays

Sometimes it was f32, other times it was i32. Now it's always i32.

This fixes:
GL45-CTS.texture_cube_map_array.image_texture_size.texture_size_compute_sh

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years ago radeonsi: fix gl_PatchVerticesIn for tessellation evaluation shader
Marek Olšák [Fri, 19 Aug 2016 23:42:09 +0000 (01:42 +0200)]
radeonsi: fix gl_PatchVerticesIn for tessellation evaluation shader

This fixes:
GL45-CTS.tessellation_shader.tessellation_control_to_tessellation_evaluation
.gl_PatchVerticesIn

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years ago radeonsi: fix cubemaps viewed as 2D
Marek Olšák [Fri, 19 Aug 2016 17:52:14 +0000 (19:52 +0200)]
radeonsi: fix cubemaps viewed as 2D

This fixes: GL43-CTS.texture_view.view_sampling

v2: fix a typo, merge both if statements

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Dave Airlie <airlied@redhat.com> (v1)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v1)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years ago radeonsi: always use the same function signature for llvm.SI.export
Marek Olšák [Sat, 27 Aug 2016 09:48:14 +0000 (11:48 +0200)]
radeonsi: always use the same function signature for llvm.SI.export

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years ago radeonsi: return correct eviction stats for NVX_gpu_memory_info
Marek Olšák [Sun, 21 Aug 2016 10:00:01 +0000 (12:00 +0200)]
radeonsi: return correct eviction stats for NVX_gpu_memory_info

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years ago gallium/radeon: also eliminate DCC fast clear in resource_get_handle
Marek Olšák [Sun, 21 Aug 2016 10:39:21 +0000 (12:39 +0200)]
gallium/radeon: also eliminate DCC fast clear in resource_get_handle

Just do what the comment says.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years ago gallium/radeon: use the current ctx for CMASK elimination in resource_get_handle
Marek Olšák [Sun, 21 Aug 2016 10:30:21 +0000 (12:30 +0200)]
gallium/radeon: use the current ctx for CMASK elimination in resource_get_handle

For coherency with the current context.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years ago gallium/radeon: use the current ctx for DCC decompression in resource_get_handle
Marek Olšák [Sun, 21 Aug 2016 10:30:21 +0000 (12:30 +0200)]
gallium/radeon: use the current ctx for DCC decompression in resource_get_handle

For coherency with the current context.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years ago gallium/radeon: derive buffer placement and flags only at initialization
Marek Olšák [Thu, 18 Aug 2016 14:30:00 +0000 (16:30 +0200)]
gallium/radeon: derive buffer placement and flags only at initialization

Invalidated buffers don't have to go through it.

Split r600_init_resource into r600_init_resource_fields and
r600_alloc_resource.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years ago radeonsi: set more sampler settings
Marek Olšák [Sun, 21 Aug 2016 14:13:16 +0000 (16:13 +0200)]
radeonsi: set more sampler settings

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years ago docs: add news item and link release notes for 12.0.2
Emil Velikov [Mon, 5 Sep 2016 15:13:48 +0000 (16:13 +0100)]
docs: add news item and link release notes for 12.0.2

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
8 years ago docs: add sha256 checksums for 12.0.2
Emil Velikov [Mon, 5 Sep 2016 15:03:06 +0000 (16:03 +0100)]
docs: add sha256 checksums for 12.0.2

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 614fb93a6d0246d5592333a1b914ce71a409fcf7)

8 years ago docs: add release notes for 12.0.2
Emil Velikov [Mon, 5 Sep 2016 11:14:11 +0000 (12:14 +0100)]
docs: add release notes for 12.0.2

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 2fc6a31f10e908af8f348aba796d0e6b1616b863)

8 years ago noop: implement resource_get_handle
Marek Olšák [Sun, 28 Aug 2016 16:50:19 +0000 (18:50 +0200)]
noop: implement resource_get_handle

X+DRI3 locks up if the returned handle is invalid.

8 years ago noop: set missing functions
Marek Olšák [Sun, 28 Aug 2016 11:58:16 +0000 (13:58 +0200)]
noop: set missing functions

8 years ago noop: simplify some functions
Marek Olšák [Sun, 28 Aug 2016 11:57:44 +0000 (13:57 +0200)]
noop: simplify some functions

8 years ago glx/glvnd: list the strcmp arguments in correct order
Emil Velikov [Thu, 1 Sep 2016 09:36:44 +0000 (10:36 +0100)]
glx/glvnd: list the strcmp arguments in correct order

Currently, because the arguments are in the inverse order, strcmp produces
a negative result when the needle is towards the start of the haystack.
Thus on the next iteration(s) we end up further towards the end and
eventually fail to locate the entry.
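
As an illustration of why the argument order matters, here is a minimal
sketch of a binary search over a sorted name table. The table contents and
the helper name find_entry are hypothetical and are not taken from the glvnd
code; the only point is that the needle has to be the first strcmp argument
for the "negative means search the lower half" logic to hold.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sorted table; the real glvnd dispatch table differs. */
static const char *const names[] = {
    "glXBindTexImageEXT",
    "glXCreateContextAttribsARB",
    "glXSwapIntervalSGI",
};

/* Returns the index of needle in the sorted table, or -1 if absent. */
static int find_entry(const char *const *table, size_t count,
                      const char *needle)
{
    size_t lo = 0, hi = count;

    assert(table != NULL && needle != NULL);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        /* Needle first: a negative result means the needle sorts before
         * table[mid], so the search continues in the lower half. */
        int cmp = strcmp(needle, table[mid]);

        if (cmp == 0)
            return (int)mid;
        if (cmp < 0)
            hi = mid;
        else
            lo = mid + 1;
    }
    return -1;
}
```

Swapping the two strcmp arguments flips the sign of cmp, which sends the
search into the wrong half on every step that is not an exact match.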

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
8 years ago nir/tests: Update the CF tests to not assume fake edges
Jason Ekstrand [Sat, 3 Sep 2016 18:57:05 +0000 (11:57 -0700)]
nir/tests: Update the CF tests to not assume fake edges

In aad4f1550, we removed the concept of "fake" edges from NIR.  Now, if you
have a block at the end of an infinite loop it really has no predecessors.
This updates the unit tests to match.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97587
Tested-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
8 years ago gk110/ir: fix quadop dall emission
Ilia Mirkin [Sun, 4 Sep 2016 22:21:29 +0000 (18:21 -0400)]
gk110/ir: fix quadop dall emission

We recently started to always emit the NDV (== dall) bit for quadops.
However, it was folded into the wrong code word.

Fixes: e0a067ed48 (nv50/ir: always emit the NDV bit for OP_QUADOP)
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
8 years ago android: intel: fix include paths in new "common" library
Mauro Rossi [Sun, 4 Sep 2016 00:00:24 +0000 (02:00 +0200)]
android: intel: fix include paths in new "common" library

Fixes building error in libmesa_intel_common static library

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
8 years ago a3xx: use window scissor to simulate viewport xy clip
Ilia Mirkin [Wed, 31 Aug 2016 02:42:24 +0000 (22:42 -0400)]
a3xx: use window scissor to simulate viewport xy clip

Unfortunately a3xx does not have a separate disable for depth clipping,
so when depth clamp is enabled, we disable the whole 3d clipper logic.
This in turn also gets rid of the xy clip that it would normally do.
When we detect this would happen, instead we integrate the viewport into
the window scissor. This may have slightly different behavior around
wide points, but it's unlikely that anything depends on this.
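
The idea of integrating the viewport into the window scissor can be sketched
as a rectangle intersection. The struct rect type and the function name
below are hypothetical and only illustrate the arithmetic; they are not the
a3xx driver's actual state-emission code.

```c
#include <assert.h>

/* Hypothetical rectangle: [minx, maxx) x [miny, maxy) in pixels. */
struct rect {
    int minx, miny, maxx, maxy;
};

static int imax(int a, int b) { return a > b ? a : b; }
static int imin(int a, int b) { return a < b ? a : b; }

/* Intersect the window scissor with the viewport bounds, so that the
 * scissor also performs the xy clip the disabled clipper would have done. */
static struct rect clip_scissor_to_viewport(struct rect scissor,
                                            struct rect viewport)
{
    struct rect out;

    assert(scissor.minx <= scissor.maxx && scissor.miny <= scissor.maxy);
    out.minx = imax(scissor.minx, viewport.minx);
    out.miny = imax(scissor.miny, viewport.miny);
    out.maxx = imin(scissor.maxx, viewport.maxx);
    out.maxy = imin(scissor.maxy, viewport.maxy);
    /* Collapse to an empty rectangle if the two do not overlap. */
    if (out.maxx < out.minx)
        out.maxx = out.minx;
    if (out.maxy < out.miny)
        out.maxy = out.miny;
    return out;
}
```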

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97231
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
8 years ago a3xx: make use of software clipping when hw can't handle it
Ilia Mirkin [Sat, 20 Aug 2016 04:14:43 +0000 (00:14 -0400)]
a3xx: make use of software clipping when hw can't handle it

The hw clipper only handles up to 6 UCPs. If there are more than 6 UCPs,
or a clip vertex, or clip distances are in use, then we must use the
fallback discard-based clipping from the frag shader.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
8 years ago a3xx: make sure to actually clamp depth as requested
Ilia Mirkin [Mon, 15 Aug 2016 03:58:18 +0000 (23:58 -0400)]
a3xx: make sure to actually clamp depth as requested

We were previously ... not clamping. I guess this meant that everything
got clamped to 1/0, which was enough to pass the existing tests. Or
perhaps the clamping would only happen to the rasterized depth value and
not the frag shader's output depth value.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97231
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
8 years ago nvc0/ir: allow min/max instructions to be dual-issued in pairs
Karol Herbst [Sat, 13 Aug 2016 09:54:52 +0000 (11:54 +0200)]
nvc0/ir: allow min/max instructions to be dual-issued in pairs

changes for GpuTest /test=pixmark_piano /benchmark /no_scorebox /msaa=0
/benchmark_duration_ms=60000 /width=1024 /height=640:

inst_executed: 1.03G
inst_issued1: 614M -> 580M
inst_issued2: 213M -> 230M

score: 1021 -> 1030

Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years ago anv: Move cmd_buffer_config_l3 into anv_cmd_buffer.c
Jason Ekstrand [Tue, 23 Aug 2016 00:13:51 +0000 (17:13 -0700)]
anv: Move cmd_buffer_config_l3 into anv_cmd_buffer.c

This is the only remaining part of genX_l3.c and there's really no good
reason for it to be in its own file.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
8 years ago anv/cmd_buffer: Move emit_lri and emit_lrm higher up
Jason Ekstrand [Tue, 23 Aug 2016 00:13:27 +0000 (17:13 -0700)]
anv/cmd_buffer: Move emit_lri and emit_lrm higher up

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
8 years ago anv: Refactor pipeline l3 config setup
Jason Ekstrand [Mon, 22 Aug 2016 23:56:48 +0000 (16:56 -0700)]
anv: Refactor pipeline l3 config setup

Now that we're using gen_l3_config.c, we no longer have one set of l3
config functions per gen and we can simplify a bit.  Also, we know that
only compute uses SLM so we don't need to look for it in all of the stages.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
8 years ago anv: Leverage the shared L3$ config code
Jason Ekstrand [Mon, 22 Aug 2016 23:39:05 +0000 (16:39 -0700)]
anv: Leverage the shared L3$ config code

When Jordan first implemented L3$ configuration for Vulkan, he copied+pasted
from the GL driver because we had no good place to share it.  Now that we
have src/intel/common, we should be sharing these tables.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
8 years ago intel: Pull the guts of gen7_l3_state.c into a shared helper
Jason Ekstrand [Mon, 22 Aug 2016 23:09:32 +0000 (16:09 -0700)]
intel: Pull the guts of gen7_l3_state.c into a shared helper

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
8 years ago intel: Rename brw_get_device_name/info to gen_get_device_name/info
Jason Ekstrand [Thu, 25 Aug 2016 23:22:58 +0000 (16:22 -0700)]
intel: Rename brw_get_device_name/info to gen_get_device_name/info

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
8 years ago intel: s/brw_device_info/gen_device_info/
Jason Ekstrand [Mon, 22 Aug 2016 22:01:08 +0000 (15:01 -0700)]
intel: s/brw_device_info/gen_device_info/

Generated by:

sed -i -e 's/brw_device_info/gen_device_info/g' src/intel/**/*.c
sed -i -e 's/brw_device_info/gen_device_info/g' src/intel/**/*.h
sed -i -e 's/brw_device_info/gen_device_info/g' **/i965/*.c
sed -i -e 's/brw_device_info/gen_device_info/g' **/i965/*.cpp
sed -i -e 's/brw_device_info/gen_device_info/g' **/i965/*.h

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
8 years ago intel: Add a new "common" library for more code sharing
Jason Ekstrand [Mon, 22 Aug 2016 21:47:55 +0000 (14:47 -0700)]
intel: Add a new "common" library for more code sharing

The first thing to go in this new library is brw_device_info.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
8 years ago intel/blorp: fix typo in android makefile
Mauro Rossi [Sat, 3 Sep 2016 09:00:25 +0000 (11:00 +0200)]
intel/blorp: fix typo in android makefile

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
8 years ago nir: remove unused variable
Timothy Arceri [Sat, 3 Sep 2016 10:30:19 +0000 (20:30 +1000)]
nir: remove unused variable

This was left over from aad4f15506c

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
8 years ago nir: remove some fields from nir_shader_compiler_options
Connor Abbott [Sat, 3 Sep 2016 04:49:58 +0000 (00:49 -0400)]
nir: remove some fields from nir_shader_compiler_options

I accidentally added these with 0dc4cab. Oops!

8 years ago nir: fix bug with moves in nir_opt_remove_phis()
Connor Abbott [Fri, 2 Sep 2016 23:07:57 +0000 (19:07 -0400)]
nir: fix bug with moves in nir_opt_remove_phis()

In 144cbf8 ("nir: Make nir_opt_remove_phis see through moves."), Ken
made nir_opt_remove_phis able to coalesce phi nodes whose sources are
all moves with the same swizzle. However, he didn't add the logic
necessary for handling the fact that the phi may now have multiple
different sources, even though the sources point to the same thing. For
example, if we had something like:

if (...)
   a1 = b.yx;
else
   a2 = b.yx;
a = phi(a1, a2)
... = a

then we would rewrite it to

if (...)
   a1 = b.yx;
else
   a2 = b.yx;
... = a1

by picking a random phi source, which in this case is invalid because
the source doesn't dominate the phi. Instead, we need to change it to:

if (...)
   a1 = b.yx;
else
   a2 = b.yx;
a3 = b.yx;
... = a3;
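
The distinction between the three cases can be sketched as a classification
step. The types and names below (struct phi_src, classify_phi) are
hypothetical simplifications and not NIR's data structures; the sketch only
shows when reusing a source is safe and when a fresh move after the phis is
needed instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified phi source: an SSA value that may itself be
 * the result of a move (swizzle) of another value. */
struct phi_src {
    int def_id;       /* SSA value feeding this phi source */
    bool is_mov;      /* true if def_id is produced by a move */
    int mov_src_id;   /* value the move reads, valid if is_mov */
    unsigned swizzle; /* packed swizzle, valid if is_mov */
};

enum phi_action {
    PHI_KEEP,         /* sources differ, the phi stays */
    PHI_REUSE_SOURCE, /* every source is the same SSA value */
    PHI_EMIT_NEW_MOV  /* equal only through moves: emit a new move
                       * after the phis and use that instead */
};

static enum phi_action classify_phi(const struct phi_src *srcs, size_t count)
{
    bool same_def = true, same_mov = true;

    assert(srcs != NULL && count > 0);
    for (size_t i = 0; i < count; i++) {
        if (srcs[i].def_id != srcs[0].def_id)
            same_def = false;
        if (!srcs[i].is_mov || !srcs[0].is_mov ||
            srcs[i].mov_src_id != srcs[0].mov_src_id ||
            srcs[i].swizzle != srcs[0].swizzle)
            same_mov = false;
    }
    if (same_def)
        return PHI_REUSE_SOURCE;
    return same_mov ? PHI_EMIT_NEW_MOV : PHI_KEEP;
}
```

In the example above, a1 and a2 are distinct values defined in different
branches, so neither dominates the phi; only the shared move source b does,
which is why a new move of b is emitted.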

Fixes 12 CTS tests:
ES31-CTS.functional.tessellation.invariance.outer_edge_symmetry.quads*

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
8 years ago nir: add nir_after_phis() cursor helper
Connor Abbott [Fri, 2 Sep 2016 23:06:52 +0000 (19:06 -0400)]
nir: add nir_after_phis() cursor helper

And re-implement nir_after_cf_node_and_phis() using it.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
8 years ago glsl: expose max atomic counter/buffer consts for tess in ES 3.2
Ilia Mirkin [Sun, 28 Aug 2016 19:04:00 +0000 (15:04 -0400)]
glsl: expose max atomic counter/buffer consts for tess in ES 3.2

Curiously OES/EXT_tessellation_shader leave these out, while ES 3.2 adds
them in.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
8 years ago mapi: don't forget to expose GetPointerv in GL ES 3.2
Ilia Mirkin [Sun, 28 Aug 2016 18:52:10 +0000 (14:52 -0400)]
mapi: don't forget to expose GetPointerv in GL ES 3.2

I left this out of my previous commit that went around enabling all of
the other ES 3.2 entrypoints.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
8 years ago main: add KHR_robustness to ES 3.2 extension requirements
Ilia Mirkin [Mon, 29 Aug 2016 00:39:07 +0000 (20:39 -0400)]
main: add KHR_robustness to ES 3.2 extension requirements

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
8 years ago nv50,nvc0: respect render condition enable flag when clearing rt/zs
Ilia Mirkin [Sat, 3 Sep 2016 03:57:06 +0000 (23:57 -0400)]
nv50,nvc0: respect render condition enable flag when clearing rt/zs

This is a newly added flag. We always pass false into it from
nv50_clear_texture, but other callers may want to respect the render
condition. (And the functions were originally spec'd to respect it.)

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years ago nvc0/ir: don't dual-issue ops that depend or interfere with each other
Karol Herbst [Sat, 13 Aug 2016 09:54:45 +0000 (11:54 +0200)]
nvc0/ir: don't dual-issue ops that depend or interfere with each other

Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
[imirkin: rewrite to split up the helpers and move more logic to target]
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years ago nir: Remove fake edges in the CF handling code
Jason Ekstrand [Wed, 31 Aug 2016 21:45:08 +0000 (14:45 -0700)]
nir: Remove fake edges in the CF handling code

When NIR was first introduced, Connor added this fake-edge hack to work
around issues related to unreachable blocks.  Thanks to GLSL IR's jump
lowering code, the only unreachable code you can have is a block after an
infinite loop.  With SPIR-V, we didn't have the jump lowering code so we
could also end up with the "if (...) { break; } else { continue; }" case
which generates an unreachable block after the if.  Because of this, most
of NIR had to be fixed up for handling unreachable blocks.  The only
remaining case of not handling unreachable blocks was specifically the
block-after-infinite-loop case in dead_cf which was fixed by the previous
commit.  We can now delete the fake edge hack.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
8 years ago nir/dead_cf: Don't crash on unreachable after-loop blocks
Jason Ekstrand [Wed, 31 Aug 2016 23:35:21 +0000 (16:35 -0700)]
nir/dead_cf: Don't crash on unreachable after-loop blocks

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
8 years ago nvc0: reduce the initial code segment size to 512KB
Samuel Pitoiset [Wed, 31 Aug 2016 20:52:47 +0000 (22:52 +0200)]
nvc0: reduce the initial code segment size to 512KB

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years ago nvc0: allow to resize the code segment dynamically
Samuel Pitoiset [Wed, 31 Aug 2016 20:52:46 +0000 (22:52 +0200)]
nvc0: allow to resize the code segment dynamically

When an application uses a ton of shaders, we need to evict them
when the code segment is full. This is not really a good solution
if monster shaders are used, because code eviction will happen a lot.

To avoid this, it seems better to dynamically resize the code
segment area after each eviction. The maximum size is arbitrarily
fixed at 8MB, which should be enough.
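
A possible growth policy is sketched below. The 8MB cap comes from this
message and the 512KB starting size from the follow-up commit in this log;
doubling on each eviction is an assumption made for illustration, and
next_text_size is a hypothetical name, not the driver's function.

```c
#include <assert.h>
#include <stdint.h>

#define TEXT_INITIAL_SIZE (512u * 1024u)       /* 512KB starting size */
#define TEXT_MAX_SIZE     (8u * 1024u * 1024u) /* 8MB cap */

/* Returns the code segment size to use after an eviction.
 * Doubling is an assumed policy; only the cap is from the commit. */
static uint32_t next_text_size(uint32_t current)
{
    assert(current >= TEXT_INITIAL_SIZE && current <= TEXT_MAX_SIZE);
    if (current > TEXT_MAX_SIZE / 2)
        return current; /* growing again would exceed the cap */
    return current * 2;
}
```

Under these assumptions the segment reaches the cap after four evictions
(512KB, 1MB, 2MB, 4MB, 8MB) and then stays there.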

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years ago nvc0: add a new bin for the code segment
Samuel Pitoiset [Wed, 31 Aug 2016 20:52:45 +0000 (22:52 +0200)]
nvc0: add a new bin for the code segment

To keep the bins list from growing indefinitely when the code segment
size is bumped, we need to separate that bin from the SCREEN
one because it contains other resources like the uniform bo.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years ago nvc0: add nvc0_screen_resize_text_area() helper
Samuel Pitoiset [Wed, 31 Aug 2016 20:52:44 +0000 (22:52 +0200)]
nvc0: add nvc0_screen_resize_text_area() helper

This function will be helpful for resizing the code segment
area when we need to evict all shaders.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years ago nvc0: re-upload currently bound shaders after code eviction
Samuel Pitoiset [Wed, 31 Aug 2016 20:52:43 +0000 (22:52 +0200)]
nvc0: re-upload currently bound shaders after code eviction

This fixes a very old issue which happens when the code segment
is full. A bunch of real applications like Tomb Raider,
F1 2015 and Elemental hit that issue because they use a ton of shaders.

In this case, all shaders are evicted (to free space) but all
currently bound shaders also need to be re-uploaded and SP_START_ID
has to be updated accordingly.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years ago nvc0: refactor the program upload process
Samuel Pitoiset [Wed, 31 Aug 2016 20:52:42 +0000 (22:52 +0200)]
nvc0: refactor the program upload process

This refactoring will help fix the "out of code space"
eviction issue because we will need to re-upload the code for
all currently bound shaders, which is slightly different from
uploading fresh code.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years ago i965: fix noop_scissor range issue on width/height
Jordan Justen [Fri, 18 Oct 2013 22:03:09 +0000 (15:03 -0700)]
i965: fix noop_scissor range issue on width/height

If scissor X or Y was set to a negative value then the previous
code might have indicated noop scissors when the scissor range
actually was masking a portion of the framebuffer.

Since fb->_Xmin, _Xmax, _Ymin and _Ymax take scissors into
account, we can use these to test for a noop scissor.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
8 years ago glsl: Only force varyings to be flat when varying packing.
Kenneth Graunke [Wed, 31 Aug 2016 05:51:25 +0000 (22:51 -0700)]
glsl: Only force varyings to be flat when varying packing.

Varying packing would like to mark certain variables as flat.
This works as long as both sides of the interfaces are changed
accordingly.  However, with SSO, we disable varying packing on
the outermost stages.  We also disable varying packing for
certain tessellation stages.

With SSO, we operate on the producer and consumer separately.
Checks based on the consumer stage and variable are risky, and
can easily lead to altering one half of the interface between
stages, breaking SSO pipeline IO validation.

Just stop monkeying around with interpolation modes unless
required for varying packing.  There's no point.  This also
disables it in unsafe SSO cases.

Fixes CTS tests:
*.tessellation_shader.tessellation_control_to_tessellation_evaluation.gl_MaxPatchVertices_Position_PointSize

Also fixes Piglit's spec/oes_geometry_shader/sso_validation:
- user-defined-gs-input-not-in-block.shader_test
- user-defined-gs-input-in-block.shader_test

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
8 years ago glsl: Reject TCS/TES input arrays not sized to gl_MaxPatchVertices.
Kenneth Graunke [Wed, 31 Aug 2016 07:16:24 +0000 (00:16 -0700)]
glsl: Reject TCS/TES input arrays not sized to gl_MaxPatchVertices.

We handled the unsized case, implicitly sizing arrays to the value
of gl_MaxPatchVertices.  But if a size was present, we failed to
raise a compile error if it wasn't the value of gl_MaxPatchVertices.

Fixes CTS tests:

  *.tessellation_shader.compilation_and_linking_errors.
  {tc,te}_invalid_array_size_used_for_input_blocks

Piglit's tcs-input-read-nonconst-* tests have recently been fixed.
This patch will break older copies of those tests, but the latest
should continue working.  Update to Piglit 75819c13af2ed5.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
8 years ago wayland-drm: add missing NULL check
Frank Binns [Thu, 4 Aug 2016 14:15:40 +0000 (15:15 +0100)]
wayland-drm: add missing NULL check

Although malloc is unlikely to fail, check its return value nevertheless.

Signed-off-by: Frank Binns <frank.binns@imgtec.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
8 years ago loader: fix sysfs uevent file parsing
Frank Binns [Thu, 4 Aug 2016 14:15:23 +0000 (15:15 +0100)]
loader: fix sysfs uevent file parsing

When trying to get a device name for an fd using sysfs, it would always fail
as it was expecting key/value pairs to be delimited by '\0', which is not the
case.
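
For illustration, a sketch of looking up a key in uevent-style text, where
each KEY=value pair sits on its own line terminated by '\n'. The helper
name uevent_get is hypothetical and this is not the loader's code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Looks up key in newline-delimited "KEY=value" text and copies the value
 * into out. Returns 1 on success, 0 if the key is missing or the value
 * does not fit. */
static int uevent_get(const char *text, const char *key,
                      char *out, size_t out_size)
{
    const char *line = text;
    size_t key_len;

    assert(text != NULL && key != NULL && out != NULL && out_size > 0);
    key_len = strlen(key);
    while (*line != '\0') {
        const char *end = strchr(line, '\n');
        size_t len = end ? (size_t)(end - line) : strlen(line);

        /* Match "key=" exactly, so "DEV" does not match "DEVNAME". */
        if (len > key_len && strncmp(line, key, key_len) == 0 &&
            line[key_len] == '=') {
            size_t val_len = len - key_len - 1;

            if (val_len >= out_size)
                return 0;
            memcpy(out, line + key_len + 1, val_len);
            out[val_len] = '\0';
            return 1;
        }
        line = end ? end + 1 : line + len;
    }
    return 0;
}
```

With a buffer such as "MAJOR=226\nMINOR=0\nDEVNAME=dri/card0\n" (made-up
sample contents), looking up "DEVNAME" yields "dri/card0".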

Signed-off-by: Frank Binns <frank.binns@imgtec.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
8 years ago egl: only store device name when Wayland support is built
Frank Binns [Fri, 17 Jun 2016 17:41:22 +0000 (18:41 +0100)]
egl: only store device name when Wayland support is built

The device name is only needed for WL_bind_wayland_display so make this clear
by only storing the device name when Wayland support is built.

Signed-off-by: Frank Binns <frank.binns@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
8 years ago isl: round format alignment to nearest power of 2
Lionel Landwerlin [Fri, 19 Aug 2016 23:38:05 +0000 (00:38 +0100)]
isl: round format alignment to nearest power of 2

A few inline asserts in anv assume alignments are powers of 2, but with
formats like R8G8B8 we have odd alignments.

v2: round up to power of 2 (Ilia)

v3: reuse util_next_power_of_two() from gallium/aux/util/u_math.h (Ilia)
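
As a sketch of the rounding itself: the standalone helper below is written
for illustration and is not the util_next_power_of_two() implementation
mentioned above. A 3-byte format such as R8G8B8 ends up with an alignment
of 4.

```c
#include <assert.h>
#include <stdint.h>

/* Rounds x up to the next power of two; returns x if it already is one. */
static uint32_t round_up_pow2(uint32_t x)
{
    uint32_t p = 1;

    assert(x != 0 && x <= 0x80000000u);
    while (p < x)
        p <<= 1;
    return p;
}
```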

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
8 years ago gallium/postprocess: Fix resource freeing
Thomas Hellstrom [Thu, 26 May 2016 09:16:28 +0000 (11:16 +0200)]
gallium/postprocess: Fix resource freeing

The code was triggering asserts in DEBUG builds of the SVGA driver since
the reference count of the resource was never decremented before destroy.
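
The general pattern, dropping the reference and letting the last release
free the object, can be sketched like this. The struct resource type and
the helpers are hypothetical and much simpler than gallium's resource
reference counting.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted object. */
struct resource {
    int refcount;
};

static struct resource *resource_create(void)
{
    struct resource *r = malloc(sizeof(*r));

    if (r != NULL)
        r->refcount = 1; /* the creator holds the first reference */
    return r;
}

/* Drops one reference and clears the caller's pointer. The object is
 * freed only when the last reference goes away. Returns 1 if freed. */
static int resource_unref(struct resource **res)
{
    struct resource *r;

    assert(res != NULL);
    r = *res;
    *res = NULL;
    if (r == NULL)
        return 0;
    assert(r->refcount > 0);
    if (--r->refcount > 0)
        return 0;
    free(r);
    return 1;
}
```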

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
8 years ago st/mesa: expose OES_geometry_shader and OES_texture_cube_map_array
Ilia Mirkin [Sat, 27 Aug 2016 21:47:37 +0000 (17:47 -0400)]
st/mesa: expose OES_geometry_shader and OES_texture_cube_map_array

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
8 years ago Introduce .editorconfig
Eric Engestrom [Tue, 30 Aug 2016 20:02:18 +0000 (21:02 +0100)]
Introduce .editorconfig

A few weeks ago, Jose Fonseca suggested [0] we use .editorconfig files
to try and enforce the formatting of the code, to which Michel Dänzer
suggested [1] we start by importing the existing .dir-locals.el
settings. The first draft was discussed in the RFC [2].

These .editorconfig files are a first step, one that has the advantage of
requiring little to no intervention from the devs once the settings
files are in place, although the settings are very limited. They also
have the advantage of applying while the code is being written.
This doesn't replace the need for more comprehensive formatting tools
such as clang-format & clang-tidy, but those reformat the code after
the fact.

[0] https://lists.freedesktop.org/archives/mesa-dev/2016-June/121545.html
[1] https://lists.freedesktop.org/archives/mesa-dev/2016-June/121639.html
[2] https://lists.freedesktop.org/archives/mesa-dev/2016-July/123431.html

Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Acked-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
8 years ago vc4: Add missing break statement.
Eric Anholt [Mon, 29 Aug 2016 18:23:35 +0000 (11:23 -0700)]
vc4: Add missing break statement.

This opcode isn't used yet, so it didn't affect anything.  Caught by
Coverity, reported to me by imirkin.

8 years ago gallium/docs: clarify render_condition_enabled parameter to clear functions
Brian Paul [Wed, 31 Aug 2016 16:03:53 +0000 (10:03 -0600)]
gallium/docs: clarify render_condition_enabled parameter to clear functions

If false, it means do the clear unconditionally.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
8 years ago mesa: Add some more .gitignore
Jason Ekstrand [Wed, 31 Aug 2016 20:45:27 +0000 (13:45 -0700)]
mesa: Add some more .gitignore

8 years ago i965: Pass start_offset to brw_set_uip_jip().
Matt Turner [Mon, 29 Aug 2016 22:57:41 +0000 (15:57 -0700)]
i965: Pass start_offset to brw_set_uip_jip().

Without this, we would pass over the instructions in the SIMD8 program
(which is located earlier in the buffer) when brw_set_uip_jip() is
called to handle the SIMD16 program.

The assertion about compacted control flow was bogus: halt, cont, break
cannot be compacted because they have both JIP and UIP. Instead, we
should never see a compacted instruction in this code at all.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
8 years ago i965: Merge gen7_clip_state atom into gen6_clip_state atom.
Kenneth Graunke [Tue, 30 Aug 2016 20:16:02 +0000 (13:16 -0700)]
i965: Merge gen7_clip_state atom into gen6_clip_state atom.

The original motivation was that gen6_clip_state ignored _NEW_POLYGON
as it didn't care about early culling.  The only other change was that
Gen6 ignored BRW_NEW_TES_PROG_DATA as it doesn't have tessellation
shaders, but listening to this is harmless as it'll never be signalled.

Now that we've added _NEW_POLYGON for is_drawing_lines/points, we can
merge the two as the distinction is meaningless.

This actually fixes a bug, though: Gen8+ was using the gen6_clip_state
atom because it doesn't care about early culling, but it also needs
BRW_NEW_TES_PROG_DATA, which was missing.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
8 years ago i965: Use gs_prog_data in is_drawing_points/lines().
Kenneth Graunke [Fri, 26 Aug 2016 05:52:22 +0000 (22:52 -0700)]
i965: Use gs_prog_data in is_drawing_points/lines().

State upload code should use prog_data rather than poking at core
Mesa shader data structures wherever possible.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
8 years ago i965: Fix missing dirty bits related to is_drawing_points/lines.
Kenneth Graunke [Fri, 26 Aug 2016 06:00:13 +0000 (23:00 -0700)]
i965: Fix missing dirty bits related to is_drawing_points/lines.

calculate_attr_overrides() uses is_drawing_points(), which depends
on tessellation and geometry program state, as well as polygon state.

v2: Add missing _NEW_POLYGON as well.  Caught by Iago Toral.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
8 years agonvc0: remove an attempt at uploading all IMMD into a CB
Samuel Pitoiset [Wed, 31 Aug 2016 15:42:05 +0000 (17:42 +0200)]
nvc0: remove an attempt at uploading all IMMD into a CB

This has never been used because info->immd.bufSize is always 0,
and in any case this is experimental code that was never
completed.

This gets rid of some unused code in the program validation process.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years agonv50: remove unused nv50_program::immd_size field
Samuel Pitoiset [Wed, 31 Aug 2016 15:42:04 +0000 (17:42 +0200)]
nv50: remove unused nv50_program::immd_size field

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years agonv30: set usage to staging so that the buffer is allocated in GART
Ilia Mirkin [Wed, 31 Aug 2016 06:12:08 +0000 (02:12 -0400)]
nv30: set usage to staging so that the buffer is allocated in GART

The code a few lines below expects to migrate the bo in question to
VRAM. Since we're filling the initial data via CPU, it's more efficient
to create the temporary buffer in GART. There is no "push" method
implemented, otherwise we'd use that instead.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
8 years agoegl/x11_dri3: provide an authentication function
Frank Binns [Fri, 17 Jun 2016 17:41:21 +0000 (18:41 +0100)]
egl/x11_dri3: provide an authentication function

To support WL_bind_wayland_display an authentication function needs to be
provided but this was not being done for this platform as it's not strictly
necessary. However, as this isn't an optional function there's the potential
for a segfault to occur if authentication is mistakenly performed. Protect
against this by providing a function that prints an error.
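
The defensive pattern described above might look roughly like the following. The callback table and the `toy_*` names are hypothetical stand-ins; the real EGL driver structures and the message text differ.

```c
#include <stdio.h>
#include <stdint.h>

/* Hypothetical callback table; the real EGL driver structures differ. */
struct toy_display_vtbl {
   int (*authenticate)(void *display, uint32_t id);
};

/* Authentication is not possible on this platform: report it and fail. */
static int
toy_dri3_authenticate(void *display, uint32_t id)
{
   (void)display;
   (void)id;
   fprintf(stderr, "authentication is not supported on this platform\n");
   return -1;
}

/* The slot is always populated, so a mistaken call cannot jump to NULL. */
static const struct toy_display_vtbl toy_dri3_vtbl = {
   .authenticate = toy_dri3_authenticate,
};
```

Filling the slot with a stub turns a potential NULL function pointer call into an ordinary error return.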

Signed-off-by: Frank Binns <frank.binns@imgtec.com>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
8 years agoegl/x11_dri3: disable WL_bind_wayland_display for devices without render nodes
Frank Binns [Fri, 17 Jun 2016 17:41:20 +0000 (18:41 +0100)]
egl/x11_dri3: disable WL_bind_wayland_display for devices without render nodes

Up until now, DRI3 was only used for devices that have render nodes, unless
overridden via an environment variable, with it falling back to DRI2 otherwise.
This limitation was there in order to support WL_bind_wayland_display as it
requires client opened device node fds to be authenticated, which isn't possible
when using DRI3. This is an unfortunate compromise as DRI3 provides security
benefits over DRI2.

Instead, allow DRI3 to be used for devices without render nodes but don't
advertise WL_bind_wayland_display in this case. Applications that need this
extension can still be run by disabling DRI3 support via the LIBGL_DRI3_DISABLE
environment variable.

Signed-off-by: Frank Binns <frank.binns@imgtec.com>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
8 years agoscons: Fix MinGW cross compilation.
Jose Fonseca [Wed, 31 Aug 2016 11:16:32 +0000 (12:16 +0100)]
scons: Fix MinGW cross compilation.

The generated GLSL header files were only being built for the host
platform, and not the target platform.

Trivial.

8 years agonv30: only bail on color/depth bpp mismatch when surfaces are swizzled
Ilia Mirkin [Wed, 31 Aug 2016 04:54:17 +0000 (00:54 -0400)]
nv30: only bail on color/depth bpp mismatch when surfaces are swizzled

The actual restriction is a little weaker than I originally thought. See
https://bugs.freedesktop.org/show_bug.cgi?id=92306#c17 for the
suggestion. This also explains why things weren't *always* failing
before, only sometimes. We will allocate a non-swizzled depth buffer for
NPOT winsys buffer sizes, which they almost always are.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
8 years agoglsl: Handle patch qualifier on interface blocks.
Kenneth Graunke [Thu, 2 Jun 2016 02:27:02 +0000 (19:27 -0700)]
glsl: Handle patch qualifier on interface blocks.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
8 years agoi965: enable OES_primitive_bounding_box with the no-op implementation
Ilia Mirkin [Sun, 28 Aug 2016 20:03:21 +0000 (16:03 -0400)]
i965: enable OES_primitive_bounding_box with the no-op implementation

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Matt Turner <mattst88@gmail.com>
8 years agost/mesa: provide the null implementation of bounding box outputs in tcs
Ilia Mirkin [Mon, 30 May 2016 17:28:02 +0000 (13:28 -0400)]
st/mesa: provide the null implementation of bounding box outputs in tcs

Until hardware appears (in a gallium driver) that can make use of the
TCS-outputted gl_BoundingBox, we just request that the variable gets
assigned as a regular patch variable.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
8 years agoglsl: add gl_BoundingBox and associated varying slots
Ilia Mirkin [Mon, 30 May 2016 15:50:07 +0000 (11:50 -0400)]
glsl: add gl_BoundingBox and associated varying slots

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
8 years agomesa: add support for GL_PRIMITIVE_BOUNDING_BOX storage and query
Ilia Mirkin [Mon, 30 May 2016 16:54:23 +0000 (12:54 -0400)]
mesa: add support for GL_PRIMITIVE_BOUNDING_BOX storage and query

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
8 years agomesa: add scaffolding for OES/EXT_primitive_bounding_box
Ilia Mirkin [Mon, 30 May 2016 15:49:26 +0000 (11:49 -0400)]
mesa: add scaffolding for OES/EXT_primitive_bounding_box

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
8 years agodocs: add GL_OES_viewport_array to features
Ilia Mirkin [Tue, 30 Aug 2016 23:46:31 +0000 (19:46 -0400)]
docs: add GL_OES_viewport_array to features

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years agoaubinator: fix if indentation and add brackets to multiline body
Timothy Arceri [Mon, 29 Aug 2016 23:53:35 +0000 (09:53 +1000)]
aubinator: fix if indentation and add brackets to multiline body

Fixes misleading indentation warning in gcc.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
8 years agoi965/fs: Assert that the number of color targets is one when dual-source blend is...
Francisco Jerez [Fri, 26 Aug 2016 01:35:06 +0000 (18:35 -0700)]
i965/fs: Assert that the number of color targets is one when dual-source blend is enabled.

Requested by Anuj during review of
4a87e4ade778e56d43333c65a58752b15a00ce69, adding as follow-up since it
led to assertion failures due to various GLSL bugs that should be
fixed now.

8 years agoglsl: Fix gl_program::OutputsWritten computation for dual-source blending.
Francisco Jerez [Sat, 20 Aug 2016 21:55:19 +0000 (14:55 -0700)]
glsl: Fix gl_program::OutputsWritten computation for dual-source blending.

In the fragment shader OutputsWritten is a bitset of FRAG_RESULT_*
enumerants, which represent the location of each color output written
by the shader.  The secondary and primary color outputs of a given
render target using dual-source blending have the same location, so
the 'idx' computation below will give the wrong bit as result if the
'var->data.index' term is non-zero -- E.g. if the shader writes the
primary and secondary colors of the FRAG_RESULT_COLOR output,
ir_set_program_inouts will think that the shader writes both
FRAG_RESULT_COLOR and FRAG_RESULT_SAMPLE_MASK, which is just bogus.

That would cause the brw_wm_prog_key::nr_color_regions computation
done in the i965 driver during fragment shader precompilation to be
wrong, which currently leads to unnecessary recompilation of shaders
that use dual-source blending, and triggers an assertion failure in
fs_visitor::emit_fb_writes() on my i965-fb-fetch branch.
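
To make the wrong bit concrete, here is a small sketch. The enumerant values are assumed for illustration (the text above only implies that FRAG_RESULT_SAMPLE_MASK directly follows FRAG_RESULT_COLOR), and the "fixed" helper is one plausible correction, not necessarily the exact change made in the patch.

```c
#include <stdint.h>

/* Illustrative enumerant values, assumed for this sketch only. */
enum toy_frag_result {
   TOY_FRAG_RESULT_COLOR = 2,
   TOY_FRAG_RESULT_SAMPLE_MASK = 3,
};

/* Buggy form: the dual-source index is folded into the bit position. */
static uint64_t
toy_outputs_written_buggy(int location, int index)
{
   return UINT64_C(1) << (location + index);
}

/* Fixed form: primary and secondary outputs share one location bit. */
static uint64_t
toy_outputs_written_fixed(int location, int index)
{
   (void)index;
   return UINT64_C(1) << location;
}
```

With the buggy form, writing index 0 and index 1 of the color output sets both the color bit and the sample mask bit; with the fixed form only the color bit is set.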

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years agoglsl: Fix incorrect hard-coded location of the gl_SecondaryFragColorEXT built-in.
Francisco Jerez [Thu, 23 Jun 2016 07:05:37 +0000 (00:05 -0700)]
glsl: Fix incorrect hard-coded location of the gl_SecondaryFragColorEXT built-in.

gl_SecondaryFragColorEXT should have the same location as gl_FragColor
for the secondary fragment color to be replicated to all fragment
outputs.  The incorrect location of gl_SecondaryFragColorEXT would
cause the linker to mark both FRAG_RESULT_COLOR and FRAG_RESULT_DATA0
as being written to, which isn't allowed by the spec and would
ultimately lead to an assertion failure in
fs_visitor::emit_fb_writes() on my i965-fb-fetch branch.

This should also fix the code below for multiple dual-source-blended
render targets, which no driver currently supports but we have plans
to enable eventually in the i965 driver (the comment saying that no
hardware will ever support it seems rather hilarious).

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years agost/glsl_to_tgsi: Use SecondaryOutputsWritten to determine dual-source fragment outputs.
Francisco Jerez [Tue, 23 Aug 2016 18:18:19 +0000 (11:18 -0700)]
st/glsl_to_tgsi: Use SecondaryOutputsWritten to determine dual-source fragment outputs.

Currently the mesa state tracker relies on there being two bits set
per dual-source output in the gl_program::OutputsWritten bitset, but
that only worked due to a GLSL front-end bug that caused it to set the
OutputsWritten bit for both location and location+1 even though at the
GLSL level the primary and secondary color outputs used for
dual-source blending have the same location.  Fix it by extending
outputMapping[] to 2*FRAG_RESULT_MAX elements in order to represent a
mapping from a (location, index) pair to its TGSI output, which should
also make it slightly easier to add support for dual-source blending
in combination with multiple render targets in the long run.
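
One way to picture a table indexed by a (location, index) pair is sketched below. The table size, the slot layout (all primary slots first, then all secondary slots), and the `toy_*` names are assumptions for illustration; the patch may lay out outputMapping[] differently.

```c
/* Hypothetical size; stands in for FRAG_RESULT_MAX. */
#define TOY_FRAG_RESULT_MAX 12

/* One slot per (location, index) pair; index is 0 (primary) or 1 (secondary). */
static unsigned toy_output_mapping[2 * TOY_FRAG_RESULT_MAX];

/* Assumed layout: all primary slots first, then all secondary slots. */
static unsigned
toy_mapping_slot(unsigned location, unsigned index)
{
   return location + index * TOY_FRAG_RESULT_MAX;
}
```

Because the two outputs of a dual-source pair get distinct slots, the lookup no longer depends on a second bit being set at location+1.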

No Piglit regressions on llvmpipe.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years agoglsl: Calculate bitset of secondary outputs written in ir_set_program_inouts.
Francisco Jerez [Tue, 23 Aug 2016 18:15:57 +0000 (11:15 -0700)]
glsl: Calculate bitset of secondary outputs written in ir_set_program_inouts.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years agoglsl: Fix typo in comment
Ian Romanick [Tue, 19 Jul 2016 22:45:03 +0000 (15:45 -0700)]
glsl: Fix typo in comment

Trivial.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
8 years agoglsl: Replace most assertions with unreachable()
Ian Romanick [Tue, 19 Jul 2016 00:38:19 +0000 (17:38 -0700)]
glsl: Replace most assertions with unreachable()

   text    data     bss     dec     hex filename
7669233  277176   28624 7975033  79b079 i965_dri.so before generated code
7647081  277176   28624 7952881  7959f1 i965_dri.so before this commit
7669289  277176   28624 7975089  79b0b1 i965_dri.so with this commit

Looking at the generated assembly, it appears that some of the changes made
in the generated code prevent some loops from being unrolled.  Removing
the default cases (via unreachable()) allows these loops to unroll again.
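
A sketch of the pattern, assuming a GCC- or Clang-compatible compiler. `TOY_UNREACHABLE` is a local stand-in written for this example, and `toy_eval` is a hypothetical function, not part of the generated constant-expression code.

```c
#include <assert.h>

/* Stand-in for this sketch; assumes a GCC- or Clang-compatible compiler. */
#define TOY_UNREACHABLE(msg) \
   do { assert(!msg); __builtin_unreachable(); } while (0)

enum toy_op { TOY_OP_ADD, TOY_OP_SUB };

/* Every valid operation returns from its case; anything else is a bug. */
static int
toy_eval(enum toy_op op, int a, int b)
{
   switch (op) {
   case TOY_OP_ADD:
      return a + b;
   case TOY_OP_SUB:
      return a - b;
   }
   /* No default case: tell the compiler this point is never reached. */
   TOY_UNREACHABLE("invalid operation");
}
```

In release builds the assert compiles away and the compiler may assume the switch is exhaustive, which gives the optimizer more freedom than a default case that falls through an assertion.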

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
8 years agoglsl: Refactor handling of horizontal operations
Ian Romanick [Mon, 18 Jul 2016 18:16:18 +0000 (11:16 -0700)]
glsl: Refactor handling of horizontal operations

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
8 years agoglsl: Use constant_template_horizontal instead of constant_template_horizontal_single...
Ian Romanick [Mon, 18 Jul 2016 18:13:55 +0000 (11:13 -0700)]
glsl: Use constant_template_horizontal instead of constant_template_horizontal_single_implementation for unops

This changes the "shape" of all the pack and unpack operators, but they
should function the same.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
8 years agoglsl: Eliminate constant_template2
Ian Romanick [Mon, 18 Jul 2016 17:58:42 +0000 (10:58 -0700)]
glsl: Eliminate constant_template2

constant_template_common can now handle the case where the result type
is different from the input type by using type_signature_iter.  This
changes the "shape" of all the cast-style operators, but they should
function the same.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
8 years agoglsl: Eliminate constant_template5
Ian Romanick [Mon, 18 Jul 2016 17:49:07 +0000 (10:49 -0700)]
glsl: Eliminate constant_template5

constant_template_common can now handle the case where the result type
is different from the input type by using type_signature_iter.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
8 years agoglsl: Eliminate constant_template0
Ian Romanick [Fri, 15 Jul 2016 23:40:06 +0000 (16:40 -0700)]
glsl: Eliminate constant_template0

This template is mostly an artefact of the development of the original
patch series and of the effort to minimize the differences between the
original code and the generated code.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
8 years agoglsl: Eliminate one of the templates for simpler operations
Ian Romanick [Fri, 15 Jul 2016 00:42:59 +0000 (17:42 -0700)]
glsl: Eliminate one of the templates for simpler operations

The differences between these two templates were mostly an artefact of
the development of the original patch series and of the effort to
minimize the differences between the original code and the generated code.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>