mesa.git
9 years ago radeonsi: simplify accessing alpha pointer in si_llvm_emit_fs_epilogue
Marek Olšák [Sat, 28 Feb 2015 16:16:57 +0000 (17:16 +0100)]
radeonsi: simplify accessing alpha pointer in si_llvm_emit_fs_epilogue

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
9 years ago radeonsi: add support for easy opcodes from ARB_gpu_shader5
Marek Olšák [Fri, 13 Mar 2015 15:21:11 +0000 (16:21 +0100)]
radeonsi: add support for easy opcodes from ARB_gpu_shader5

I have to use the BFE intrinsics, because BFE is one of the most complex
instructions that can't be matched easily. BFE has 3 conditional branches
and one of them is quite big.

In the isel DAG, lowered BFE has 27 nodes (including leaves).
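
As a rough illustration of what a bitfield extract (BFE) computes, here is a
minimal C sketch. The names ubfe() and ibfe() are hypothetical, and this is
not the radeonsi lowering; it only shows the special cases (zero width, full
width, sign extension) that may make the operation awkward to pattern-match.
It assumes offset < 32 and offset + bits <= 32.

```c
#include <stdint.h>

/* Unsigned bitfield extract: returns `bits` bits of `value`, starting at
 * bit `offset`. Assumes offset < 32 and offset + bits <= 32. */
static uint32_t ubfe(uint32_t value, unsigned offset, unsigned bits)
{
   if (bits == 0)
      return 0;
   if (bits == 32)
      return value; /* offset is 0 here; avoids an undefined 32-bit shift */
   return (value >> offset) & ((1u << bits) - 1u);
}

/* Signed variant: the top bit of the extracted field is replicated upward. */
static int32_t ibfe(uint32_t value, unsigned offset, unsigned bits)
{
   uint32_t field = ubfe(value, offset, bits);

   if (bits > 0 && bits < 32 && (field & (1u << (bits - 1))))
      field |= ~0u << bits;
   return (int32_t)field;
}
```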

9 years ago radeonsi: implement bit-finding opcodes from ARB_gpu_shader5
Marek Olšák [Sat, 28 Feb 2015 13:01:43 +0000 (14:01 +0100)]
radeonsi: implement bit-finding opcodes from ARB_gpu_shader5

Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
9 years ago radeonsi: implement gl_SampleMaskIn
Marek Olšák [Fri, 27 Feb 2015 23:30:26 +0000 (00:30 +0100)]
radeonsi: implement gl_SampleMaskIn

Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
9 years ago radeonsi: add support for SQRT
Marek Olšák [Mon, 2 Mar 2015 01:40:57 +0000 (02:40 +0100)]
radeonsi: add support for SQRT

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
9 years ago radeonsi: add support for FMA
Marek Olšák [Fri, 27 Feb 2015 23:44:19 +0000 (00:44 +0100)]
radeonsi: add support for FMA

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
9 years ago gallium/radeon: don't use LLVMReadOnlyAttribute for ALU
Marek Olšák [Fri, 27 Feb 2015 17:39:40 +0000 (18:39 +0100)]
gallium/radeon: don't use LLVMReadOnlyAttribute for ALU

None of the instructions use a pointer argument.
(+ small cosmetic changes)

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
9 years ago tgsi: handle bitwise opcodes in tgsi_opcode_infer_type (v2)
Marek Olšák [Fri, 27 Feb 2015 23:34:53 +0000 (00:34 +0100)]
tgsi: handle bitwise opcodes in tgsi_opcode_infer_type (v2)

v2: set the same types as the destination type in tgsi_exec

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
9 years ago gallium: add FMA and DFMA opcodes (v3)
Marek Olšák [Fri, 27 Feb 2015 23:26:31 +0000 (00:26 +0100)]
gallium: add FMA and DFMA opcodes (v3)

Needed by ARB_gpu_shader5.

v2: select DMAD for FMA with double precision
v3: add and select DFMA

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
9 years ago freedreno: update generated headers
Rob Clark [Sun, 15 Mar 2015 21:59:01 +0000 (17:59 -0400)]
freedreno: update generated headers

Fix a3xx texture layer-size.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
9 years ago freedreno/ir3: remove old compiler
Rob Clark [Wed, 11 Mar 2015 19:10:25 +0000 (15:10 -0400)]
freedreno/ir3: remove old compiler

Now that piglit is no longer falling back to old compiler for any tests,
we can remove it.  Hurray \o/

Signed-off-by: Rob Clark <robclark@freedesktop.org>
9 years ago freedreno/ir3: avoid scheduler deadlock
Rob Clark [Wed, 11 Mar 2015 17:21:42 +0000 (13:21 -0400)]
freedreno/ir3: avoid scheduler deadlock

Deadlock can occur if we schedule an address register write, yet some
instructions which depend on that address register value also depend on
other unscheduled instructions that depend on a different address
register value.  To solve this, before scheduling an address register
write, ensure that all the other dependencies of the instructions which
consume this address register are already scheduled.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
9 years ago freedreno/ir3: bit of cleanup
Rob Clark [Wed, 11 Mar 2015 16:36:26 +0000 (12:36 -0400)]
freedreno/ir3: bit of cleanup

Add an array_insert() macro to simplify inserting into dynamically sized
arrays, add a comment, and remove unused prototype inherited from the
original freedreno.git/fdre-a3xx test code, etc.
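
A macro of this kind might look roughly like the sketch below. It is a
hypothetical version, not the ir3 code: it assumes the array `arr` is paired
with variables named `arr_count` and `arr_sz`, doubles the capacity when full
(starting at 16), and aborts if realloc() fails. make_squares() is only a
small usage example.

```c
#include <stdlib.h>

/* Hypothetical sketch: append `val` to the growable array `arr`, whose
 * element count and capacity live in variables arr_count and arr_sz. */
#define array_insert(arr, val) do {                                   \
      if (arr ## _count == arr ## _sz) {                              \
         arr ## _sz = arr ## _sz ? 2 * arr ## _sz : 16;               \
         void *grown = realloc(arr, arr ## _sz * sizeof((arr)[0]));   \
         if (!grown)                                                  \
            abort();                                                  \
         arr = grown;                                                 \
      }                                                               \
      (arr)[arr ## _count++] = (val);                                 \
   } while (0)

/* Returns a heap-allocated array holding 0*0 .. (n-1)*(n-1); the caller
 * frees it. The number of elements is stored in *count_out. */
static int *make_squares(unsigned n, unsigned *count_out)
{
   int *squares = NULL;
   unsigned squares_count = 0, squares_sz = 0;

   for (unsigned i = 0; i < n; i++)
      array_insert(squares, (int)(i * i));

   *count_out = squares_count;
   return squares;
}
```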

Signed-off-by: Rob Clark <robclark@freedesktop.org>
9 years ago i965: De-duplicate is_expression_commutative() functions.
Kenneth Graunke [Fri, 13 Mar 2015 21:34:06 +0000 (14:34 -0700)]
i965: De-duplicate is_expression_commutative() functions.

Create a backend_inst::is_commutative() method to replace two static
functions that did the exact same thing.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
9 years ago i965/gen4-5: Cope with immutable-format texture revalidation
Chris Forbes [Mon, 8 Dec 2014 07:37:00 +0000 (20:37 +1300)]
i965/gen4-5: Cope with immutable-format texture revalidation

This is unfortunately sometimes necessary due to rebasing levels when
rendering into them.

16 piglits crash -> pass, when building mesa with debug enabled.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
9 years ago docs: add news item and link release notes for mesa 10.5.1
Emil Velikov [Fri, 13 Mar 2015 23:36:33 +0000 (23:36 +0000)]
docs: add news item and link release notes for mesa 10.5.1

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
9 years ago docs: Add sha256 sums for the 10.5.1 release
Emil Velikov [Fri, 13 Mar 2015 23:32:12 +0000 (23:32 +0000)]
docs: Add sha256 sums for the 10.5.1 release

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 2abba086ca84f200fae940129c0a5342c3748f00)

9 years ago Add release notes for the 10.5.1 release
Emil Velikov [Fri, 13 Mar 2015 22:32:57 +0000 (22:32 +0000)]
Add release notes for the 10.5.1 release

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 11c0ff60ef19cca84452aa989fb8bb25127473e0)

9 years ago freedreno: fix slice pitch calculations
Ilia Mirkin [Fri, 13 Mar 2015 05:36:57 +0000 (01:36 -0400)]
freedreno: fix slice pitch calculations

For example if width were 65, the first slice would get 96 while the
second would get 32. However the hardware appears to expect the second
pitch to be 64, based on halving the 96 (and aligning up to 32).
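
The arithmetic above can be reproduced with the following C sketch. The
function names are hypothetical, and the "halve the previous aligned pitch"
rule is only the reading suggested by this message, not a statement about
the actual freedreno code.

```c
#include <stdint.h>

/* Round v up to a multiple of a (a must be a power of two). */
static uint32_t align_up(uint32_t v, uint32_t a)
{
   return (v + a - 1) & ~(a - 1);
}

/* Halve v once per mip level, never going below 1. */
static uint32_t minify(uint32_t v, unsigned level)
{
   uint32_t m = v >> level;
   return m ? m : 1;
}

/* Pitch computed from the unaligned width of each level. */
static uint32_t pitch_from_level_width(uint32_t width, unsigned level)
{
   return align_up(minify(width, level), 32);
}

/* Pitch computed by halving the previous, already aligned pitch. */
static uint32_t pitch_from_previous_pitch(uint32_t width, unsigned level)
{
   uint32_t pitch = align_up(width, 32);

   for (unsigned i = 0; i < level; i++)
      pitch = align_up(minify(pitch, 1), 32);
   return pitch;
}
```

For width 65, the first function gives 96 and then 32, while the second
gives 96 and then 64.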

This fixes texelFetch piglit tests on a3xx below a certain size. Going
higher they break again, but most likely due to unrelated reasons.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
9 years ago freedreno/a3xx: use the same layer size for all slices
Ilia Mirkin [Fri, 13 Mar 2015 04:53:49 +0000 (00:53 -0400)]
freedreno/a3xx: use the same layer size for all slices

We only program in one layer size per texture, so that means that all
levels must share one size. This makes the piglit test

bin/texelFetch fs sampler2DArray

have the same breakage as its non-array version instead of being
completely off, and makes

bin/ext_texture_array-gen-mipmap

start passing.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
9 years ago i965/vs: Add missing resolve_bool_comparison calls on GEN4 and GEN5
Ian Romanick [Wed, 25 Feb 2015 01:57:18 +0000 (20:57 -0500)]
i965/vs: Add missing resolve_bool_comparison calls on GEN4 and GEN5

The ir_unop_any problem was discovered by some later optimization passes
that generate ir_triop_csel.  I was also able to reproduce it by
modifying the gl-2.0-vertexattribpointer vertex shader to generate its
result using

   color = mix(vec4(0, 1, 0, 0),
               vec4(1, 0, 0, 0),
               bvec4(any(greaterThan(diff, vec4(tolerance)))));

instead of an if-statement.  This also required using #version 130 and
MESA_GLSL_VERSION_OVERRIDE=130.

I have not nominated this for stable releases because I don't think
there's any way to trigger the problem without GLSL 1.30 or
optimizations that don't exist in stable.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Abdiel Janulgue <abdiel.janulgue@intel.com>
9 years ago i965/disasm: Fix format strings
Chris Forbes [Fri, 13 Mar 2015 18:10:11 +0000 (07:10 +1300)]
i965/disasm: Fix format strings

Most of the brw_inst_* API returns 64-bit values. This fixes disassembly
of sampler messages, etc.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
9 years ago i965/disasm: Mark format() as being printf-style.
Chris Forbes [Fri, 13 Mar 2015 18:10:10 +0000 (07:10 +1300)]
i965/disasm: Mark format() as being printf-style.

This allows us to get warnings from GCC when we mess up the format
strings.
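
A minimal sketch of the technique, assuming GCC or Clang: the
format(printf, ...) function attribute tells the compiler which argument is
the format string and where the variadic arguments start. PRINTFLIKE,
format_into() and describe_field() are hypothetical names, not the i965
disassembler's own.

```c
#include <inttypes.h>
#include <stdarg.h>
#include <stdint.h>
#include <stdio.h>

#if defined(__GNUC__)
#define PRINTFLIKE(fmt_idx, first_arg) \
   __attribute__((format(printf, fmt_idx, first_arg)))
#else
#define PRINTFLIKE(fmt_idx, first_arg)
#endif

/* Argument 3 is the format string; the variadic arguments start at 4. */
static int format_into(char *buf, size_t size, const char *fmt, ...)
   PRINTFLIKE(3, 4);

static int format_into(char *buf, size_t size, const char *fmt, ...)
{
   va_list args;

   va_start(args, fmt);
   int n = vsnprintf(buf, size, fmt, args);
   va_end(args);
   return n;
}

/* With the attribute in place, passing a uint64_t to "%d" would draw a
 * -Wformat warning; PRIu64 is the matching conversion. */
static int describe_field(char *buf, size_t size, uint64_t value)
{
   return format_into(buf, size, "field = %" PRIu64, value);
}
```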

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
9 years ago docs: List ARB_shading_language_packing/EXT_shader_integer_mix.
Matt Turner [Thu, 12 Mar 2015 01:43:56 +0000 (18:43 -0700)]
docs: List ARB_shading_language_packing/EXT_shader_integer_mix.

Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
9 years ago glsl: Expose built-in packing functions under GLSL 4.2.
Matt Turner [Thu, 12 Mar 2015 01:14:28 +0000 (18:14 -0700)]
glsl: Expose built-in packing functions under GLSL 4.2.

ARB_shading_language_packing is part of GLSL 4.2, not 4.0 as I
mistakenly believed. The following functions are available only with
ARB_shading_language_packing, GLSL 4.2 (not GLSL 4.0), or ES 3.0:

   - packSnorm2x16
   - unpackSnorm2x16
   - packHalf2x16
   - unpackHalf2x16

Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
9 years ago egl: Create queryable strings in eglInitialize().
Matt Turner [Tue, 10 Mar 2015 18:41:57 +0000 (11:41 -0700)]
egl: Create queryable strings in eglInitialize().

Creating/recreating the strings in eglQueryString() is extra work and
isn't thread-safe, as exhibited by shader-db's run.c using libepoxy.

Multiple threads in run.c call eglReleaseThread() around the same time.
libepoxy calls eglQueryString() to determine whether eglReleaseThread()
exists, and our EGL implementation passes a pointer to the version
string to libepoxy while simultaneously overwriting the string, leading
to a failure in libepoxy.

Moreover, the EGL spec says (emphasis mine):

"eglQueryString returns a pointer to a *static*, zero-terminated string"

This patch moves some auxiliary functions from eglmisc.c to eglapi.c so
that they may be used to create the extension, API, and version strings
once during eglInitialize(). The auxiliary functions are renamed from
_eglUpdate* to _eglCreate*, and some checks made unnecessary by calling
the functions from eglInitialize() are removed.
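
The general pattern can be sketched in C as follows: build the string once
at initialization and make the query a pure read. The struct and function
names here are hypothetical and much simpler than Mesa's EGL display code.

```c
#include <stdio.h>

/* Hypothetical display record; not Mesa's _EGLDisplay. */
struct display {
   int major, minor;
   char version_string[32];
};

/* Called once per display: this is the only place the string is written. */
static void display_initialize(struct display *dpy, int major, int minor)
{
   dpy->major = major;
   dpy->minor = minor;
   snprintf(dpy->version_string, sizeof(dpy->version_string),
            "%d.%d", major, minor);
}

/* The query never writes, so concurrent callers always see a stable,
 * zero-terminated string. */
static const char *display_query_version(const struct display *dpy)
{
   return dpy->version_string;
}
```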

Reviewed-by: Chad Versace <chad.versace@intel.com>
9 years ago glsl: optimize (0 cmp x + y) into (-x cmp y).
Samuel Iglesias Gonsalvez [Tue, 24 Feb 2015 18:02:57 +0000 (19:02 +0100)]
glsl: optimize (0 cmp x + y) into (-x cmp y).

The optimization done by commit 34ec1a24d did not take it into account.

Fixes:

dEQP-GLES3.functional.shaders.random.all_features.fragment.20

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
9 years ago mesa: Check for valid PBO access in gl(Compressed)Tex(Sub)Image calls
Eduardo Lima Mitev [Thu, 12 Mar 2015 07:16:09 +0000 (08:16 +0100)]
mesa: Check for valid PBO access in gl(Compressed)Tex(Sub)Image calls

This patch adds two types of checks to the gl(Compressed)Tex(Sub)Image family
of functions when a pixel buffer object is bound to GL_PIXEL_UNPACK_BUFFER:

- That the buffer is not mapped.
- The total data size is within the boundaries of the buffer size.

It does so by calling auxiliary validation functions from the PBO API:
_mesa_validate_pbo_source() for non-compressed texture calls, and
_mesa_validate_pbo_source_compressed() for compressed texture calls.

The first check is defined in Section 6.3.2 'Effects of Mapping Buffers
on Other GL Commands' of the GLES 3.1 spec, page 57:

    "Any GL command which attempts to read from, write to, or change the
     state of a buffer object may generate an INVALID_OPERATION error if all
     or part of the buffer object is mapped. However, only commands which
     explicitly describe this error are required to do so. If an error is not
     generated, using such commands to perform invalid reads, writes, or
     state changes will have undefined results and may result in GL
     interruption or termination."

Similar wording exists in GL 4.5 spec, page 76.

In the case of gl(Compressed)Tex(Sub)Image(2,3)D, the specification doesn't force
implementations to throw an error. However, since Mesa doesn't currently implement
checks to determine when it is safe to read/write from/to a mapped PBO, we
should always return the error if all or part of it is mapped.

The 2nd check is defined in Section 8.5 'Texture Image Specification' of the
OpenGL 4.5 spec, page 203:

    "An INVALID_OPERATION error is generated if a pixel unpack buffer object
     is bound and storing texture data would access memory beyond the end of
     the pixel unpack buffer."
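
The two checks can be sketched in C as below. The types and the function
name are hypothetical and far simpler than Mesa's buffer objects; in the
real API both failures are reported as INVALID_OPERATION.

```c
#include <stdbool.h>
#include <stddef.h>

enum pbo_status { PBO_OK, PBO_ERR_MAPPED, PBO_ERR_OUT_OF_BOUNDS };

/* Hypothetical stand-in for a bound pixel unpack buffer. */
struct pixel_buffer {
   size_t size;  /* total size of the buffer in bytes */
   bool mapped;  /* true while the application has it mapped */
};

/* Check that reading data_size bytes at offset from buf is allowed. */
static enum pbo_status
check_pbo_source(const struct pixel_buffer *buf, size_t offset,
                 size_t data_size)
{
   /* First check: the buffer must not be mapped. */
   if (buf->mapped)
      return PBO_ERR_MAPPED;

   /* Second check: the access must end inside the buffer. Written this
    * way so that offset + data_size cannot overflow. */
   if (offset > buf->size || data_size > buf->size - offset)
      return PBO_ERR_OUT_OF_BOUNDS;

   return PBO_OK;
}
```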

Fixes 4 dEQP tests:
* dEQP-GLES3.functional.negative_api.texture.compressedteximage2d_invalid_buffer_target
* dEQP-GLES3.functional.negative_api.texture.compressedtexsubimage2d_invalid_buffer_target
* dEQP-GLES3.functional.negative_api.texture.compressedteximage3d_invalid_buffer_target
* dEQP-GLES3.functional.negative_api.texture.compressedtexsubimage3d_invalid_buffer_target

Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>
9 years ago mesa: Separate PBO validation checks from buffer mapping, to allow reuse
Eduardo Lima Mitev [Thu, 12 Mar 2015 07:14:03 +0000 (08:14 +0100)]
mesa: Separate PBO validation checks from buffer mapping, to allow reuse

Internal PBO functions such as _mesa_map_validate_pbo_source() and
_mesa_validate_pbo_compressed_teximage() perform validation and buffer mapping
within the same call.

This patch takes out the validation into separate functions to allow reuse
of functionality by other code (i.e., gl(Compressed)Tex(Sub)Image).

Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>
9 years ago mesa: Set the correct image size in _mesa_validate_pbo_access()
Eduardo Lima Mitev [Thu, 5 Mar 2015 08:20:11 +0000 (09:20 +0100)]
mesa: Set the correct image size in _mesa_validate_pbo_access()

_mesa_validate_pbo_access() provides a generic way to check that a
requested pixel transfer operation on a PBO falls within the
boundaries of the buffer. It is used in various other places, and
depending on the caller, some arguments are used or not.

In particular, the 'clientMemSize' argument is used only by calls
that are knowledgeable of the total size of the user data involved
in a pixel transfer, such as the case of compressed texture image
calls. Other calls don't provide 'clientMemSize' directly since it
is made implicit from the size and format of the texture, and its
data type. In these cases, a sufficiently big value is passed to
'clientMemSize' (INT_MAX) to avoid an incorrect constraint.

The problem is that _mesa_validate_pbo_access() uses uint
pointers to make the calculations, which are 64 bits long on 64-bit
platforms, while the dummy INT_MAX passed in 'clientMemSize'
is just 32 bits. This causes an undesired constraint.

This patch fixes that by checking whether 'clientMemSize' is INT_MAX
and, if so, assuming UINTPTR_MAX instead.
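
A minimal C sketch of the workaround, using hypothetical helper names and
assuming 'clientMemSize' is never negative: the INT_MAX sentinel is widened
to UINTPTR_MAX before any pointer-sized comparison is made.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Treat INT_MAX as "no limit" by widening it to the largest pointer-sized
 * value; any other value is taken literally. */
static uintptr_t effective_client_size(int client_mem_size)
{
   if (client_mem_size == INT_MAX)
      return UINTPTR_MAX;
   return (uintptr_t)client_mem_size;
}

/* True if the byte range [start, start + length) lies within the limit. */
static bool access_fits(uintptr_t start, uintptr_t length,
                        int client_mem_size)
{
   uintptr_t limit = effective_client_size(client_mem_size);

   return start <= limit && length <= limit - start;
}
```

Without the widening, a range that starts at or beyond INT_MAX on a 64-bit
platform would be rejected even though the caller meant "unlimited".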

This is an ugly workaround for the fact that _mesa_validate_pbo_access()
is intended to be a one-function-fits-all. The clean solution here would
be to break it into different functions that provide an adequate API
for each of the possible code paths and validation needs.

Since there are callers relying on passing INT_MAX to 'clientMemSize',
this patch is necessary to deal with the problem above until a cleaner
PBO API is implemented.

Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>
9 years ago meta: Remove error checks for texture <-> pixel-buffer transfers that don't belong...
Eduardo Lima Mitev [Tue, 10 Mar 2015 18:33:30 +0000 (19:33 +0100)]
meta: Remove error checks for texture <-> pixel-buffer transfers that don't belong in driver code

The implementation of texture <-> pixel-buffer transfers in the drivers' common
layer includes certain error checks and argument validation that don't belong
there, considering how the Mesa codebase is laid out. These are higher level
validations that, if necessary, should be performed earlier (i.e., in the GL API
entry points).

This patch simply removes these error checks from driver code.

For more information, see discussion at
http://lists.freedesktop.org/archives/mesa-dev/2015-February/077417.html.

Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>
9 years ago util: convert slab macros to inline functions
Brian Paul [Thu, 12 Mar 2015 21:50:20 +0000 (15:50 -0600)]
util: convert slab macros to inline functions

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
9 years ago egl: fix cast to silence compiler warning
Brian Paul [Thu, 12 Mar 2015 14:35:38 +0000 (08:35 -0600)]
egl: fix cast to silence compiler warning

eglcurrent.c: In function '_eglSetTSD':
eglcurrent.c:57:4: warning: passing argument 2 of 'tss_set' discards
'const' qualifier from pointer target type [enabled by default]
    tss_set(_egl_TSD, (const void *) t);
    ^
In file included from ../../../include/c11/threads.h:72:0,
                 from eglcurrent.c:32:
../../../include/c11/threads_posix.h:357:1: note: expected 'void *'
but argument is of type 'const void *'
 tss_set(tss_t key, void *val)
 ^

Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
9 years ago gallivm: (trivial) Fix typo in comment introduced by 70dc8a
Alexandre Demers [Fri, 13 Mar 2015 00:50:08 +0000 (20:50 -0400)]
gallivm: (trivial) Fix typo in comment introduced by 70dc8a

Fix typo in comment introduced by 70dc8a

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Jose Fonseca <jfonseca@vmware.com>
9 years ago mesa: improve ARB_copy_image internal format compat check
Seán de Búrca [Sat, 7 Mar 2015 09:23:53 +0000 (02:23 -0700)]
mesa: improve ARB_copy_image internal format compat check

The memory layout of compatible internal formats may differ in bytes per
block, so TexFormat is not a reliable measure of compatibility. For example,
GL_RGB8 and GL_RGB8UI are compatible formats, but GL_RGB8 may be laid out in
memory as B8G8R8X8. If GL_RGB8UI has a 3 byte-per-block memory layout, the
existing compatibility check will fail.

Additionally, the current check allows any two compressed textures which share
block size to be used, whereas the spec gives an explicit table of compatible
formats.

v2: Use a switch instead of array iteration for block class and show the
    correct GL error when internal formats are mismatched.
v3: Include spec citations for new compatibility checks, rearrange check
    order to ensure that compressed, view-compatible formats return the
    correct result, and make style fixes. Original commit message amended
    for clarity.
v4: Reformatted spec citations.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
9 years ago nir: Fix non-determinism in nir_lower_vars_to_ssa().
Kenneth Graunke [Tue, 10 Mar 2015 01:36:31 +0000 (18:36 -0700)]
nir: Fix non-determinism in nir_lower_vars_to_ssa().

Previously, we stored derefs in a hash table, using the malloc'd pointer
as the key.  Then, we walked through the hash table and generated code,
based on the order of the hash table's elements.

Memory addresses returned by malloc are pretty much random, which meant
that the hash was random, and the hash table's elements would be walked
in some random order.  This led to successive compiles of the same
shader using different variable names and slightly different orderings
of phi-nodes.  Code could not be diff'd, and the final assembly would
sometimes change slightly too.

It turns out the only point of the hash table was to avoid inserting
the same node multiple times for different dereferences.  We never
actually searched the hash table!  This patch uses an intrusive
linked list instead.  Since exec_list uses head and tail sentinels,
checking prev or next against NULL will tell us whether the node is
already in the list.
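
The idea can be sketched in C as follows. This is a simplified, hypothetical
list with two explicit sentinel nodes, not Mesa's exec_list: a node that has
been initialized but not inserted has NULL pointers, while a linked node
always sits between two other nodes, so a NULL test answers "is it already
queued?" and iteration order is simply insertion order.

```c
#include <stdbool.h>
#include <stddef.h>

struct link {
   struct link *next, *prev;
};

/* Head and tail are sentinels: head.prev and tail.next stay NULL. */
struct list {
   struct link head, tail;
};

static void list_init(struct list *l)
{
   l->head.prev = NULL;
   l->head.next = &l->tail;
   l->tail.prev = &l->head;
   l->tail.next = NULL;
}

/* A node that is not in any list has both pointers set to NULL. */
static void link_init(struct link *n)
{
   n->next = NULL;
   n->prev = NULL;
}

static bool link_is_in_list(const struct link *n)
{
   return n->next != NULL;
}

/* Append n unless it is already linked; returns true if it was added. */
static bool list_push_tail_once(struct list *l, struct link *n)
{
   if (link_is_in_list(n))
      return false;

   n->prev = l->tail.prev;
   n->next = &l->tail;
   l->tail.prev->next = n;
   l->tail.prev = n;
   return true;
}
```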

Pair programming with Jason Ekstrand.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
9 years ago util: Fix foreach_list_typed_safe when exec_node is not at offset 0.
Jason Ekstrand [Tue, 10 Mar 2015 01:36:30 +0000 (18:36 -0700)]
util: Fix foreach_list_typed_safe when exec_node is not at offset 0.

__next and __prev are pointers to the structure containing the exec_node
link, not the embedded exec_node.  NULL checks would fail unless the
embedded exec_node happened to be at offset 0 in the parent struct.

v2: Jason Ekstrand <jason.ekstrand@intel.com>:
   Use "(__node)->__field.next != NULL" to check for the end of the list
   instead of the "&__next->__field != NULL".  The former is far more
   obviously correct as it matches what the non-safe versions do.  The
   original code tried to avoid any use of __next as the client code may
   delete it during its execution.  However, since the looping condition is
   checked after the iteration clause but before the client code is
   executed, we know that __node is valid during the looping condition.
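
A small C sketch of the same reasoning, with hypothetical names and a
simplified two-sentinel list rather than the real exec_list macros. The
link is deliberately placed at a non-zero offset inside struct item. Unlike
the macro in this commit, the loop walks the embedded links and converts to
the containing struct inside the body, which keeps the example short; the
end-of-list test is still "the current link's next pointer is NULL".

```c
#include <stddef.h>

struct link {
   struct link *next, *prev;
};

struct list {
   struct link head, tail; /* sentinels; tail.next is always NULL */
};

struct item {
   int value;
   struct link link; /* deliberately not at offset 0 */
};

#define item_from_link(l) \
   ((struct item *)((char *)(l) - offsetof(struct item, link)))

static void list_init(struct list *l)
{
   l->head.prev = NULL;
   l->head.next = &l->tail;
   l->tail.prev = &l->head;
   l->tail.next = NULL;
}

static void list_push_tail(struct list *l, struct item *it)
{
   it->link.prev = l->tail.prev;
   it->link.next = &l->tail;
   l->tail.prev->next = &it->link;
   l->tail.prev = &it->link;
}

static void list_remove(struct item *it)
{
   it->link.prev->next = it->link.next;
   it->link.next->prev = it->link.prev;
   it->link.next = NULL;
   it->link.prev = NULL;
}

/* Unlink every negative item and return the sum of the remaining ones.
 * `next` is saved before the body runs, so the body may unlink `cur`. */
static int drop_negatives(struct list *l)
{
   int sum = 0;

   for (struct link *cur = l->head.next, *next = cur->next;
        cur->next != NULL; /* NULL only for the tail sentinel */
        cur = next, next = cur->next) {
      struct item *it = item_from_link(cur);

      if (it->value < 0)
         list_remove(it);
      else
         sum += it->value;
   }
   return sum;
}
```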

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
9 years ago i965: Use NIR for scalar VS when INTEL_USE_NIR is set.
Kenneth Graunke [Mon, 9 Mar 2015 08:58:59 +0000 (01:58 -0700)]
i965: Use NIR for scalar VS when INTEL_USE_NIR is set.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
9 years ago i965/fs: Add VS output support to nir_setup_outputs().
Kenneth Graunke [Mon, 9 Mar 2015 08:58:58 +0000 (01:58 -0700)]
i965/fs: Add VS output support to nir_setup_outputs().

Adapted from fs_visitor::visit(ir_variable *).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
9 years ago i965/fs: Handle VS inputs in the NIR backend.
Kenneth Graunke [Mon, 9 Mar 2015 08:58:57 +0000 (01:58 -0700)]
i965/fs: Handle VS inputs in the NIR backend.

(Jason noted that this is not a good long term solution, and we should
instead improve nir_lower_io so that this extra set of MOVs is
unnecessary.  I tend to agree, but decided we could do that as a
follow-up improvement.)

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
9 years ago i965/fs: Refactor fs_visitor::nir_setup_inputs().
Kenneth Graunke [Mon, 9 Mar 2015 08:58:56 +0000 (01:58 -0700)]
i965/fs: Refactor fs_visitor::nir_setup_inputs().

No functional change.  In preparation for supporting vertex shaders,
this adds a switch statement on shader stage (since vertex attributes
and fragment shader varyings will need different handling).  It also
renames "varying" to "input", to be more general.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
9 years ago i965: Implement NIR intrinsics for loading VS system values.
Kenneth Graunke [Mon, 9 Mar 2015 08:58:55 +0000 (01:58 -0700)]
i965: Implement NIR intrinsics for loading VS system values.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
9 years ago nir: Add intrinsics for SYSTEM_VALUE_BASE_VERTEX and VERTEX_ID_ZERO_BASE
Kenneth Graunke [Mon, 9 Mar 2015 08:58:54 +0000 (01:58 -0700)]
nir: Add intrinsics for SYSTEM_VALUE_BASE_VERTEX and VERTEX_ID_ZERO_BASE

Ian and I added these around the time Connor was developing NIR.  Now
that both exist, we should make them work together!

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
9 years ago i965/nir: Lower to registers a bit later.
Kenneth Graunke [Mon, 9 Mar 2015 08:58:53 +0000 (01:58 -0700)]
i965/nir: Lower to registers a bit later.

We can't safely call nir_optimize() with registers present, since several
passes called in the loop can't handle registers, and will fail asserts.

Notably, nir_lower_vec_alus() and nir_opt_algebraic() really don't want
registers.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
9 years ago i965/nir: Optimize after nir_lower_var_copies().
Kenneth Graunke [Mon, 9 Mar 2015 08:58:52 +0000 (01:58 -0700)]
i965/nir: Optimize after nir_lower_var_copies().

Array variable copy splitting generates a bunch of stuff we want to
clean up before proceeding.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
9 years ago i965/fs: Store a pointer to brw_sampler_prog_key_data in the visitor.
Kenneth Graunke [Mon, 9 Mar 2015 08:58:51 +0000 (01:58 -0700)]
i965/fs: Store a pointer to brw_sampler_prog_key_data in the visitor.

The NIR backend hardcodes brw_wm_prog_key at the moment, which won't
work when we support scalar VS.  We could use get_tex(), but it's a
static method.  I was going to promote it to fs_visitor, but then
realized that both parameters (stage and key) are already members.

It then occurred to me that we could just set up a pointer in the
constructor, and skip having a function altogether.

This patch also converts all existing users to use key_tex.

v2: Make key_tex a "const brw_sampler_prog_key_data *" instead of
    non-const; word-wrap some lines.  (Review comments from Topi.)

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
9 years ago tnl: HAVE_LE32_VERTS is never defined, remove associated code
Brian Paul [Wed, 11 Mar 2015 23:10:53 +0000 (17:10 -0600)]
tnl: HAVE_LE32_VERTS is never defined, remove associated code

Reviewed-by: Matt Turner <mattst88@gmail.com>
9 years ago mesa: move LONGSTRING into generated enums.c
Brian Paul [Wed, 11 Mar 2015 22:54:15 +0000 (16:54 -0600)]
mesa: move LONGSTRING into generated enums.c

enums.c is the only place this directive is needed.

Reviewed-by: Matt Turner <mattst88@gmail.com>
9 years ago mesa: remove _ASMAPI, ASMAPIP
Brian Paul [Wed, 11 Mar 2015 14:38:09 +0000 (08:38 -0600)]
mesa: remove _ASMAPI, ASMAPIP

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
9 years ago mesa: remove _XFORMAPI
Brian Paul [Wed, 11 Mar 2015 14:33:21 +0000 (08:33 -0600)]
mesa: remove _XFORMAPI

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
9 years ago swrast: remove _BLENDAPI
Brian Paul [Wed, 11 Mar 2015 14:29:56 +0000 (08:29 -0600)]
swrast: remove _BLENDAPI

_BLENDAPI boils down to __cdecl on Windows, but __cdecl is the default
calling convention so this serves no purpose.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
9 years ago mesa: use ARRAY_SIZE in _mesa_QueryMatrixxOES()
Brian Paul [Sat, 7 Mar 2015 20:15:22 +0000 (13:15 -0700)]
mesa: use ARRAY_SIZE in _mesa_QueryMatrixxOES()

Reviewed-by: Matt Turner <mattst88@gmail.com>
9 years ago mesa: remove register keyword, add const in _mesa_QueryMatrixxOES()
Brian Paul [Sat, 7 Mar 2015 20:15:22 +0000 (13:15 -0700)]
mesa: remove register keyword, add const in _mesa_QueryMatrixxOES()

Reviewed-by: Matt Turner <mattst88@gmail.com>
9 years ago mesa: reindent querymatrix.c
Brian Paul [Sat, 7 Mar 2015 20:15:22 +0000 (13:15 -0700)]
mesa: reindent querymatrix.c

Use 3-space indents, not 4.  Move some comments after the case statements.

Acked-by: Matt Turner <mattst88@gmail.com>
9 years ago mesa: move fpclassify work-arounds into c99_math.h
Brian Paul [Sat, 7 Mar 2015 20:15:22 +0000 (13:15 -0700)]
mesa: move fpclassify work-arounds into c99_math.h

v2: Use #error in the #else clause, per Jose.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
9 years ago gallivm: Prevent double delete on LLVM 3.6
Jose Fonseca [Thu, 12 Mar 2015 09:57:43 +0000 (09:57 +0000)]
gallivm: Prevent double delete on LLVM 3.6

std::unique_ptr takes ownership of MM, and a double delete could ensue
in case of an error, as pointed out by Chris Vine in
https://bugs.freedesktop.org/show_bug.cgi?id=89387

Reviewed-by: Chris Vine <chris@cvine.freeserve.co.uk>
9 years ago autogen.sh: pass --force to autoreconf, quote ORIGDIR
Emil Velikov [Mon, 9 Mar 2015 11:46:07 +0000 (11:46 +0000)]
autogen.sh: pass --force to autoreconf, quote ORIGDIR

By passing --force, autoreconf will update all the aux files, which would
otherwise be ignored if one updates autoconf/automake.

Quote the ORIGDIR variable to prevent fallout when its name contains
a space.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
9 years ago glx: remove support for non-multithreaded platforms
Emil Velikov [Fri, 6 Mar 2015 16:54:59 +0000 (16:54 +0000)]
glx: remove support for non-multithreaded platforms

Implicitly required for a while, although commit 9385c592c68 (mapi:
remove u_thread.h) was the one that put the final nail in the
coffin.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
9 years ago glx: remove final reference to THREADS
Emil Velikov [Fri, 6 Mar 2015 16:54:58 +0000 (16:54 +0000)]
glx: remove final reference to THREADS

Left over from commit 18db13f5865 (mapi: THREADS was always defined,
remove it)

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
9 years ago configure: require pthreads for POSIX builds
Emil Velikov [Fri, 6 Mar 2015 16:54:57 +0000 (16:54 +0000)]
configure: require pthreads for POSIX builds

This has been an implicit rule for building mesa for a long time. Let's
make it official and just bail out at configure time. This way we can
clean up some of our glx code.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
9 years ago egl/main: convert thread management to use c11 threads
Emil Velikov [Fri, 6 Mar 2015 16:54:56 +0000 (16:54 +0000)]
egl/main: convert thread management to use c11 threads

Convert the code to use the C11 threads implementation, and nuke the
Windows non-pthreads code-path. The c11/threads_win32.h abstraction
should be better than the current code.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
9 years ago egl/main: use c11/threads' mutex directly
Emil Velikov [Fri, 6 Mar 2015 16:54:55 +0000 (16:54 +0000)]
egl/main: use c11/threads' mutex directly

Remove the inline wrappers/abstraction layer.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
9 years ago nir/worklist: Don't change the start index when computing the tail index
Jason Ekstrand [Tue, 3 Mar 2015 01:59:38 +0000 (17:59 -0800)]
nir/worklist: Don't change the start index when computing the tail index

Reviewed-by: Mark Janes <mark.a.janes@intel.com>
9 years ago nir: Optimize a + neg(a)
Thomas Helland [Sat, 28 Feb 2015 19:32:32 +0000 (20:32 +0100)]
nir: Optimize a + neg(a)

Shader-db i965 instructions:
total instructions in shared programs: 1711180 -> 1711159 (-0.00%)
instructions in affected programs:     825 -> 804 (-2.55%)
helped:                                9
HURT:                                  0
GAINED:                                3
LOST:                                  3

Shader-db NIR instructions:
total instructions in shared programs: 606187 -> 606179 (-0.00%)
instructions in affected programs:     298 -> 290 (-2.68%)
helped:                                4
HURT:                                  0
GAINED:                                0
LOST:                                  0

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Thomas Helland <thomashelland90@gmail.com>
9 years agonir: Optimize (a*b)+(a*c) -> a*(b+c)
Thomas Helland [Sat, 28 Feb 2015 19:32:31 +0000 (20:32 +0100)]
nir: Optimize (a*b)+(a*c) -> a*(b+c)

Shader-db i965 instructions:
total instructions in shared programs: 1715894 -> 1710802 (-0.30%)
instructions in affected programs:     443080 -> 437988 (-1.15%)
helped:                                1502
HURT:                                  13
GAINED:                                4
LOST:                                  4

Shader-db NIR instructions:
total instructions in shared programs: 607710 -> 606187 (-0.25%)
instructions in affected programs:     208285 -> 206762 (-0.73%)
helped:                                769
HURT:                                  8
GAINED:                                0
LOST:                                  0
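
To make the saving concrete, here is a minimal sketch of the two forms using
unsigned 32-bit arithmetic, where the identity holds exactly even on
wraparound. The function names are made up for this example; floating-point
rounding is a separate question that this sketch does not address.

```c
#include <assert.h>
#include <stdint.h>

/* Original form: two multiplies and one add. */
static uint32_t
mul_add_before(uint32_t a, uint32_t b, uint32_t c)
{
   return (a * b) + (a * c);
}

/* Rewritten form: one add and one multiply. */
static uint32_t
mul_add_after(uint32_t a, uint32_t b, uint32_t c)
{
   return a * (b + c);
}

/* Both forms agree for unsigned integers (arithmetic modulo 2^32). */
static void
check_same_result(uint32_t a, uint32_t b, uint32_t c)
{
   assert(mul_add_before(a, b, c) == mul_add_after(a, b, c));
}
```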

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Thomas Helland <thomashelland90@gmail.com>
9 years agovbo: improve the code style by adjust the preprocessing c code directives
Marius Predut [Wed, 11 Mar 2015 09:25:00 +0000 (03:25 -0600)]
vbo: improve the code style by adjust the preprocessing c code directives

Brian Paul review suggestion: there's more macro use here than necessary.
Removed and redefined some #define preprocessing directives.
Removed the directive input parameter 'T'.
No functional changes.

Signed-off-by: Marius Predut <marius.predut@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
9 years agomesa: remove CPU_TO_LE32() for AIX
Brian Paul [Sun, 8 Mar 2015 22:46:39 +0000 (16:46 -0600)]
mesa: remove CPU_TO_LE32() for AIX

This is the only remnant of AIX-specific code in Mesa.  Probably long
unused.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
9 years agomesa: remove #define __volatile
Brian Paul [Sun, 8 Mar 2015 22:44:28 +0000 (16:44 -0600)]
mesa: remove #define __volatile

Not actually used anywhere in Mesa.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
9 years agomesa: use strdup() instead of _mesa_strdup()
Brian Paul [Sat, 7 Mar 2015 20:15:22 +0000 (13:15 -0700)]
mesa: use strdup() instead of _mesa_strdup()

We were already using strdup() in various places in Mesa.  Get rid
of the _mesa_strdup() wrapper.  All the callers pass a non-NULL
argument so the NULL check isn't needed either.
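
A minimal sketch of the difference, assuming a wrapper of roughly this shape.
wrapped_strdup() and copy_name() are hypothetical names for illustration; the
real _mesa_strdup() is not reproduced here.

```c
#define _POSIX_C_SOURCE 200809L  /* for strdup() */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a NULL-checking wrapper. */
static char *
wrapped_strdup(const char *s)
{
   if (s == NULL)
      return NULL;
   return strdup(s);
}

/* Direct call: the caller guarantees a non-NULL argument, so the extra
 * check adds nothing. */
static char *
copy_name(const char *name)
{
   assert(name != NULL);
   return strdup(name);
}
```

In both cases the caller owns the returned string and releases it with free().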

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
9 years agost/glx: use strdup() instead of _mesa_strdup()
Brian Paul [Sat, 7 Mar 2015 20:15:22 +0000 (13:15 -0700)]
st/glx: use strdup() instead of _mesa_strdup()

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
9 years agoxlib: use strdup() instead of _mesa_strdup()
Brian Paul [Sat, 7 Mar 2015 20:15:22 +0000 (13:15 -0700)]
xlib: use strdup() instead of _mesa_strdup()

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
9 years agoi915: add parens to silence operator precedence warning
Brian Paul [Tue, 10 Mar 2015 14:18:27 +0000 (08:18 -0600)]
i915: add parens to silence operator precedence warning

Signed-off-by: Brian Paul <brianp@vmware.com>
9 years agoi965: Fix out-of-bounds accesses into pull_constant_loc array
Iago Toral Quiroga [Tue, 10 Mar 2015 10:36:43 +0000 (11:36 +0100)]
i965: Fix out-of-bounds accesses into pull_constant_loc array

The piglit test glsl-fs-uniform-array-loop-unroll.shader_test was designed
to do an out of bounds access into a uniform array to make sure that we
handle that situation gracefully inside the driver, however, as Ken describes
in bug 79202, Valgrind reports that this is leading to an out-of-bounds access
in fs_visitor::demote_pull_constants().

Before accessing the pull_constant_loc array we should make sure that
the uniform we are trying to access is valid.
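
The kind of guard described here can be sketched as follows. The helper name,
its parameters, and the -1 "no location" result are assumptions made for this
example; the actual code in fs_visitor::demote_pull_constants() is not shown.

```c
#include <assert.h>
#include <stddef.h>

/* Return the pull-constant location for uniform_nr, or -1 when the index
 * lies outside the array (hypothetical helper). */
static int
lookup_pull_constant_loc(const int *pull_constant_loc,
                         unsigned num_uniforms,
                         unsigned uniform_nr)
{
   assert(pull_constant_loc != NULL);

   /* Validate the index before touching the array. */
   if (uniform_nr >= num_uniforms)
      return -1;

   return pull_constant_loc[uniform_nr];
}
```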

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79202
Reviewed-by: Matt Turner <mattst88@gmail.com>
9 years agoi965/gen6 gs: Convert brw_imm_ud/brw_imm_d to src_reg
Jordan Justen [Sat, 21 Feb 2015 23:05:22 +0000 (15:05 -0800)]
i965/gen6 gs: Convert brw_imm_ud/brw_imm_d to src_reg

Same idea as this patch, only for gen6_gs_visitor:

commit 49a938a265f5959c9b558995cc658f80acb6eb18
Author: Jordan Justen <jordan.l.justen@intel.com>
Date:   Fri Feb 20 12:12:25 2015 -0800
    i965/fs: Use fs_reg for CS/VS atomics pixel mask immediate data

Suggested-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
9 years agoi965/fs: Use unsigned for CS/VS atomics pixel mask immediate data
Jordan Justen [Sat, 21 Feb 2015 23:00:28 +0000 (15:00 -0800)]
i965/fs: Use unsigned for CS/VS atomics pixel mask immediate data

brw_imm_ud(0xffff) should have been converted to fs_reg(0xffffu) to
make sure the uint32_t fs_reg constructor was matched.
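
The commit is about C++ overload resolution, but the underlying point, that
the suffix changes the literal's type, can be shown with a C11 _Generic
analogue. LITERAL_TYPE_NAME is a made-up macro for this sketch, and the result
for 0xffff assumes a platform where int is at least 32 bits wide.

```c
#include <assert.h>
#include <string.h>

/* Report the type the compiler assigns to an expression. */
#define LITERAL_TYPE_NAME(x) _Generic((x), \
   int: "int",                             \
   unsigned int: "unsigned int",           \
   default: "other")

static void
check_literal_types(void)
{
   /* Without a suffix the constant fits in int, so it is signed. */
   assert(strcmp(LITERAL_TYPE_NAME(0xffff), "int") == 0);
   /* The u suffix makes it unsigned, which selects the unsigned variant. */
   assert(strcmp(LITERAL_TYPE_NAME(0xffffu), "unsigned int") == 0);
}
```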

commit 49a938a265f5959c9b558995cc658f80acb6eb18
Author: Jordan Justen <jordan.l.justen@intel.com>
Date:   Fri Feb 20 12:12:25 2015 -0800
    i965/fs: Use fs_reg for CS/VS atomics pixel mask immediate data

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
9 years agoi965/gen8: Don't allocate hiz miptree structure
Jordan Justen [Sun, 29 Jun 2014 19:06:33 +0000 (12:06 -0700)]
i965/gen8: Don't allocate hiz miptree structure

We now skip allocating a hiz miptree for gen8. Instead, we calculate
the required hiz buffer parameters and allocate a bo directly.

v2:
 * Update hz_height calculation as suggested by Topi
v3:
 * Bail if we failed to create the bo (Ben)
v4:
 * CEILING => DIV_ROUND_UP
 * Make sure mt->logical_depth0 being 0 would not cause trouble
 * Fail if Y tiling is not returned
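
For reference, a round-up division of the kind named in v4 is commonly written
as below. This is a sketch of a widespread definition and is not necessarily
the exact macro Mesa uses; rows_needed() is a hypothetical helper.

```c
#include <assert.h>

/* Common round-up division; an assumption, not Mesa's verbatim macro. */
#define DIV_ROUND_UP(a, b) (((a) + (b) - 1) / (b))

/* Number of align-sized rows needed to cover height. */
static unsigned
rows_needed(unsigned height, unsigned align)
{
   assert(align != 0);
   return DIV_ROUND_UP(height, align);
}
```

With this definition a zero numerator simply yields zero.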

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67564
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
9 years agoi965/gen7: Don't allocate hiz miptree structure
Jordan Justen [Sun, 29 Jun 2014 19:06:33 +0000 (12:06 -0700)]
i965/gen7: Don't allocate hiz miptree structure

We now skip allocating a hiz miptree for gen7. Instead, we calculate
the required hiz buffer parameters and allocate a bo directly.

v2:
 * Update hz_height calculation as suggested by Topi
v3:
 * Bail if we failed to create the bo (Ben)
v4:
 * CEILING => DIV_ROUND_UP
 * Make sure mt->logical_depth0 being 0 would not cause trouble
 * Fail if Y tiling is not returned

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67564
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
9 years agoi965/gen8: Don't rely directly on the hiz miptree structure
Jordan Justen [Sun, 29 Jun 2014 19:06:33 +0000 (12:06 -0700)]
i965/gen8: Don't rely directly on the hiz miptree structure

We are still allocating a miptree for hiz, but we only use fields from
intel_miptree_aux_buffer. This will allow us to switch over to not
allocating a miptree.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
9 years agoi965/gen7: Don't rely directly on the hiz miptree structure
Jordan Justen [Sun, 29 Jun 2014 19:06:33 +0000 (12:06 -0700)]
i965/gen7: Don't rely directly on the hiz miptree structure

We are still allocating a miptree for hiz, but we only use fields from
intel_miptree_aux_buffer. This will allow us to switch over to not
allocating a miptree.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
9 years agoi965/hiz: Start to separate miptree out from hiz buffers
Jordan Justen [Sun, 29 Jun 2014 18:55:26 +0000 (11:55 -0700)]
i965/hiz: Start to separate miptree out from hiz buffers

Today we allocate a miptree for the hiz buffer. We needed this in
the past because we would point the hardware at offsets of the hiz
buffer. Since the hiz format is not documented, this is not a good
idea.

Since moving to support layered rendering on Gen7+, we no longer point
at an offset into the buffer on Gen7+.

Therefore, to support hiz on Gen7+, we don't need a full miptree
structure allocated.

This patch starts to create a new auxiliary buffer structure
(intel_miptree_aux_buffer) that can be a simpler miptree
side-band buffer associated with a miptree. (For example, to serve the
needs of the hiz buffer.)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
9 years agomesa/scissor: fix typos in debug names
Dave Airlie [Tue, 10 Mar 2015 06:45:18 +0000 (16:45 +1000)]
mesa/scissor: fix typos in debug names

Just noticed this when working on virgl.

Signed-off-by: Dave Airlie <airlied@redhat.com>
9 years agonvc0: fix wrong max value for driver queries
Samuel Pitoiset [Sun, 8 Mar 2015 16:18:07 +0000 (17:18 +0100)]
nvc0: fix wrong max value for driver queries

The maximum value of a Gallium HUD's panel is automatically adjusted
when the current value is greater than the max. If we set the
pipe_query_driver_info::max_value to UINT64_MAX, the maximum value is
never adjusted and this results in a flat line instead of a pretty curve
which is correctly scaled.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
9 years agoi965: Silence GCC maybe-uninitialized warning.
Vinson Lee [Sat, 7 Mar 2015 06:08:00 +0000 (22:08 -0800)]
i965: Silence GCC maybe-uninitialized warning.

brw_shader.cpp: In function ‘bool brw_saturate_immediate(brw_reg_type, brw_reg*)’:
brw_shader.cpp:618:31: warning: ‘sat_imm.brw_saturate_immediate(brw_reg_type, brw_reg*)::<anonymous union>::ud’ may be used uninitialized in this function [-Wmaybe-uninitialized]
       reg->dw1.ud = sat_imm.ud;
                               ^

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
9 years agoi915: Fix GCC unused-but-set-variable warning in release build.
Vinson Lee [Sat, 7 Mar 2015 05:52:31 +0000 (21:52 -0800)]
i915: Fix GCC unused-but-set-variable warning in release build.

i915_fragprog.c: In function ‘i915ValidateFragmentProgram’:
i915_fragprog.c:1453:11: warning: variable ‘k’ set but not used [-Wunused-but-set-variable]
       int k;
           ^

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
9 years agoAdd macro for unused function attribute.
Vinson Lee [Sat, 7 Mar 2015 22:07:10 +0000 (14:07 -0800)]
Add macro for unused function attribute.

Suggested-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
9 years agometa: Plug memory leak
Ben Widawsky [Sat, 7 Mar 2015 01:31:00 +0000 (17:31 -0800)]
meta: Plug memory leak

It looks like this has existed since
commit f5a477ab76b6e0b268387699cd2253a43db0dfae
Author: Ian Romanick <ian.d.romanick@intel.com>
Date:   Mon Dec 16 11:54:08 2013 -0800

    meta: Refactor shader generation code out of mipmap generation path

Valgrind was complaining on fbo-generatemipmap-formats

v2: Instead, do the allocation after the early return block

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
9 years agoi965/fs: Don't issue FB writes for bound but unwritten color targets.
Kenneth Graunke [Fri, 27 Feb 2015 01:45:49 +0000 (17:45 -0800)]
i965/fs: Don't issue FB writes for bound but unwritten color targets.

We used to loop over all color attachments, and emit FB writes for each
one, even if the shader didn't write to a corresponding output variable.
Those color attachments would be filled with garbage (undefined values).

Football Manager binds a framebuffer with 4 color attachments, but draws
to it using a shader that only writes to gl_FragData[0..2].  This meant
that color attachment 3 would be filled with garbage, resulting in
rendering artifacts.  Now we skip writing to it, fixing rendering.

Writes to gl_FragColor initialize outputs[0..nr_color_regions-1] to
GRFs, while writes to gl_FragData[i] initialize outputs[i].
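
A minimal sketch of the new loop shape, assuming a per-target flag that says
whether the shader wrote that output. count_fb_writes(), output_written and
MAX_COLOR_TARGETS are hypothetical names; the driver's own bookkeeping is not
reproduced here.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_COLOR_TARGETS 8  /* hypothetical limit for this sketch */

/* Count the FB writes to emit: one per color target the shader wrote. */
static unsigned
count_fb_writes(const bool output_written[], unsigned nr_color_regions)
{
   assert(nr_color_regions <= MAX_COLOR_TARGETS);

   unsigned writes = 0;
   for (unsigned target = 0; target < nr_color_regions; target++) {
      /* Bound but unwritten: skip instead of writing undefined values. */
      if (!output_written[target])
         continue;
      writes++;
   }
   return writes;
}
```

For the case described above (4 attachments, outputs 0..2 written) this yields
3 writes and leaves attachment 3 alone.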

Thanks to Jason Ekstrand for tracking this down.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86747
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: mesa-stable@lists.freedesktop.org
9 years agoi965/fs: Make emit_shader_time_end() insert before EOT.
Kenneth Graunke [Fri, 27 Feb 2015 06:55:54 +0000 (22:55 -0800)]
i965/fs: Make emit_shader_time_end() insert before EOT.

Previously, we emitted the shader-time epilogue from emit_fb_writes(),
during the middle of looping through color regions (or emit_urb_writes
for the VS).  This is duplicated several times and rather awkward.

I need to fix a bug in our FB write handling, and it will be a lot
easier if we move emit_shader_time_end() out of there.

Now, we simply emit FB writes/URB writes, and subsequently have
emit_shader_time_end() insert instructions before the final SEND with
EOT.  Not only is this simpler, it's actually a slight improvement:
we now include the MOVs to set up the final FB write payload in our
shader-time measurements.

Note that INTEL_DEBUG=shader_time only exists on Gen7+, and uses
send-from-GRF.  (In the past, we might have hit trouble where both
attempt to use MRFs for messages; that's not a problem now.)

v2: Rebase on v3 of the previous patch and other shader_time fixes.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> [v1]
Acked-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
9 years agoi965/fs: Make get_timestamp() pass back the MOV rather than emitting it.
Kenneth Graunke [Fri, 27 Feb 2015 07:51:27 +0000 (23:51 -0800)]
i965/fs: Make get_timestamp() pass back the MOV rather than emitting it.

This makes another part of the INTEL_DEBUG=shader_time code emittable
at arbitrary locations, rather than just at the end of the instruction
stream.

v2: Don't lose smear!  Caught by Topi Pohjolainen.
v3: Don't set smear on the destination of the MOV.  Thanks Topi!

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
9 years agoi965/fs: Make emit_shader_time_write return rather than emit.
Kenneth Graunke [Fri, 27 Feb 2015 06:49:04 +0000 (22:49 -0800)]
i965/fs: Make emit_shader_time_write return rather than emit.

Instead of emit_shader_time_write, we now do emit(SHADER_TIME_ADD(...)).
The advantage is that we can also insert a shader time write at an
arbitrary location in the instruction stream, rather than being
restricted to emitting at the end.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
9 years agoi965/fs: Set smear on shader_time diff register.
Kenneth Graunke [Sun, 8 Mar 2015 08:13:41 +0000 (00:13 -0800)]
i965/fs: Set smear on shader_time diff register.

The ADD(diff, diff, fs_reg(-2u)) instruction reads diff, which is a
width 1 register.  We need to read it as <0,1,0> with a subreg of 0,
which is what smear accomplishes.

Fixes assertion:
brw_eu_emit.c:285: validate_reg: Assertion `hstride == 0' failed.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86974
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
9 years agoi965/fs: Set force_writemask_all on shader_time instructions.
Kenneth Graunke [Sun, 8 Mar 2015 07:01:07 +0000 (23:01 -0800)]
i965/fs: Set force_writemask_all on shader_time instructions.

These computations don't have anything to do with the currently
executing channels, so they should use force_writemask_all.

This fixes assert failures.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86974
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
9 years agor600g: Use R600_MAX_VIEWPORTS instead of 16
Alexandre Demers [Wed, 25 Feb 2015 06:50:49 +0000 (01:50 -0500)]
r600g: Use R600_MAX_VIEWPORTS instead of 16

Let's define R600_MAX_VIEWPORTS instead of using 16 here and there
in the code when looping through viewports and scissors. It is
easier to understand what this number represents.

v2: Missed a case where R600_MAX_VIEWPORTS should have been used.

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
9 years agoi915: Remove unused IS_GEN2 macro
Ian Romanick [Thu, 5 Mar 2015 19:26:53 +0000 (11:26 -0800)]
i915: Remove unused IS_GEN2 macro

Inspired by Damien's recent libdrm changes.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
9 years agoi915: Remove (mostly) unused IS_915 macro
Ian Romanick [Thu, 5 Mar 2015 18:55:32 +0000 (10:55 -0800)]
i915: Remove (mostly) unused IS_915 macro

Inspired by Damien's recent libdrm changes.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
9 years agoi915: Remove (mostly) unused IS_PNV, IS_PNVG, and IS_PNVGM macros
Ian Romanick [Thu, 5 Mar 2015 18:47:56 +0000 (10:47 -0800)]
i915: Remove (mostly) unused IS_PNV, IS_PNVG, and IS_PNVGM macros

Inspired by Damien's recent libdrm changes.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
9 years agoi915: Remove IS_9XX macro
Ian Romanick [Thu, 5 Mar 2015 18:27:04 +0000 (10:27 -0800)]
i915: Remove IS_9XX macro

Since the i915 / i965 split, IS_9XX just means IS_GEN3.  Inspired by
Damien's recent libdrm changes.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
9 years agoi915: Remove unused IS_MOBILE macro
Ian Romanick [Thu, 5 Mar 2015 18:24:57 +0000 (10:24 -0800)]
i915: Remove unused IS_MOBILE macro

Inspired by Damien's recent libdrm changes.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
9 years agoi965: Don't write past the end of the application supplied buffer
Ian Romanick [Sat, 28 Feb 2015 02:43:00 +0000 (18:43 -0800)]
i965: Don't write past the end of the application supplied buffer

Both the AMD and Intel APIs provide a dataSize parameter, and this
function would merrily ignore it.  Neither API specifies what to do when
the buffer isn't big enough.  I take the easy route of writing all the
complete bits of data that will fit.  With more complete specs, we could
probably do something different.
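
The "write only the complete items that fit" approach can be sketched as
follows, with the offset included in the check as noted for v2.
write_results() and its parameters are assumptions for illustration, not the
signature of brw_get_perf_monitor_result().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy as many whole 32-bit values as fit into data, starting at byte
 * offset. Returns the number of bytes written. */
static size_t
write_results(void *data, size_t data_size, size_t offset,
              const uint32_t *values, size_t count)
{
   assert(data != NULL && values != NULL);

   size_t written = 0;
   for (size_t i = 0; i < count; i++) {
      /* Stop before a value that would cross the end of the buffer. */
      if (offset + written + sizeof(uint32_t) > data_size)
         break;
      memcpy((char *)data + offset + written, &values[i], sizeof(uint32_t));
      written += sizeof(uint32_t);
   }
   return written;
}
```

A partial trailing value is never written, so bytes past the last complete
item stay untouched.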

I noticed this while looking into an unused parameter warning.  The
warning was actually useful!

brw_performance_monitor.c: In function 'brw_get_perf_monitor_result':
brw_performance_monitor.c:1261:37: warning: unused parameter 'data_size' [-Wunused-parameter]
                             GLsizei data_size,
                                     ^

v2: Fix checks to include offset in the calculation.  Noticed by Jan.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
9 years agoi965: Silence unused parameter warning
Ian Romanick [Sat, 28 Feb 2015 02:42:03 +0000 (18:42 -0800)]
i965: Silence unused parameter warning

All dd functions take a gl_context as the first parameter.  Instead of
removing it, just silence the warning.
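
One common way to keep a parameter in the signature while silencing the
warning is a cast to void, sketched below. The types and function here are
hypothetical stand-ins, and the commit may have used a different mechanism.

```c
#include <stdlib.h>

struct context;   /* opaque, hypothetical stand-in for the context type */

struct monitor {
   int active;
};

/* The first parameter is part of the shared callback signature even
 * though this implementation has no use for it. */
static struct monitor *
new_monitor(struct context *ctx)
{
   (void) ctx;   /* explicitly mark as intentionally unused */
   return calloc(1, sizeof(struct monitor));
}
```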

brw_performance_monitor.c: In function 'brw_new_perf_monitor':
brw_performance_monitor.c:1354:41: warning: unused parameter 'ctx' [-Wunused-parameter]
 brw_new_perf_monitor(struct gl_context *ctx)
                                         ^

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>