git.libre-soc.org Git - mesa.git/log

Jordan Justen [Mon, 28 Mar 2016 19:08:49 +0000 (12:08 -0700)]

anv: Fix cache pollution race during L3 partitioning set-up.

Port 0aa4f99f562a05880a779707cbcd46be459863bf to anv.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>

Jason Ekstrand [Mon, 28 Mar 2016 19:27:40 +0000 (12:27 -0700)]

nir/spirv: Remove the NoContraction hack

NIR now just handles this for us by not fusing if the multiply is marked as
exact.

Jason Ekstrand [Mon, 28 Mar 2016 18:47:27 +0000 (11:47 -0700)]

i965/peephole_ffma: Only match a mul+add if none of the ops are exact

Jason Ekstrand [Mon, 28 Mar 2016 18:12:33 +0000 (11:12 -0700)]

nir/search: Don't match inexact expressions with exact subexpressions

In the first pass of implementing exact handling, I made a mistake with
search-and-replace.  In particular, we only really handled exact/inexact
on the root of the tree.  Instead, we need to check every node in the tree
for an exact/inexact match.  As an example of this, consider the following
GLSL code:

precise float a = b + c;
if (a < 0) {
   do_stuff();
}

In that case, only the add will be declared "exact" and an expression that
looks for "b + c < 0" will still match and replace it with "b < -c" which
may yield different results.  The solution is to simply bail if any of the
values are exact when matching an inexact expression.
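
A minimal sketch of the per-node check, written in Python for illustration
only. It is not the nir_search.c code: the Expr class and both helper
functions are hypothetical, and the sketch walks the whole tree instead of
only the nodes covered by a search pattern.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Expr:
    """Hypothetical stand-in for a node in an ALU expression tree."""
    op: str
    exact: bool = False
    srcs: List["Expr"] = field(default_factory=list)


def contains_exact(expr: Expr) -> bool:
    """Return True if this node or any node below it is marked exact."""
    return expr.exact or any(contains_exact(src) for src in expr.srcs)


def may_apply(expr: Expr, rule_is_inexact: bool) -> bool:
    """An inexact rule must bail if any node in the tree is exact."""
    return not (rule_is_inexact and contains_exact(expr))


# precise float a = b + c;  if (a < 0) { ... }
add = Expr("fadd", exact=True, srcs=[Expr("b"), Expr("c")])
cmp_lt = Expr("flt", srcs=[add, Expr("zero")])

root_only_ok = not cmp_lt.exact           # root-only check: rewrite allowed
whole_tree_ok = may_apply(cmp_lt, True)   # per-node check: rewrite refused
```

With the root-only check the comparison is eligible for the rewrite; with
the per-node check it is left alone because the add underneath it is exact.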

Jason Ekstrand [Fri, 25 Mar 2016 22:53:40 +0000 (15:53 -0700)]

i965: Allow mul+add fusing again

Jason Ekstrand [Fri, 25 Mar 2016 22:30:46 +0000 (15:30 -0700)]

spirv/alu: Add support for the NoContraction decoration

Jason Ekstrand [Fri, 25 Mar 2016 22:15:45 +0000 (15:15 -0700)]

spirv/glsl: Add a helper for converting glsl opcodes into nir opcodes

This is similar to the way that regular ALU operations are handled.

Jason Ekstrand [Fri, 25 Mar 2016 21:40:57 +0000 (14:40 -0700)]

nir/spirv: Get rid of the spirv2nir helper binary

This was useful once upon a time but now that we have a real Vulkan driver
to run our SPIR-V binaries through, there's really no point.

Nanley Chery [Fri, 18 Mar 2016 22:12:32 +0000 (15:12 -0700)]

anv/blit2d: Add a function to create an ImageView

This function differs from the open-coded implementation in that the
ImageView's width is determined by the caller and is not unconditionally
set to match the number of texels within the surface's pitch.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>

Nanley Chery [Mon, 21 Mar 2016 17:41:06 +0000 (10:41 -0700)]

anv/image: Enable specifying a surface's minimum pitch

This is required to create multiple, horizontally adjacent, max-width
images from one blit2d surface. This is also required for more accurate
width specification of surfaces within a larger surface (which is seen
as the smaller surface's enclosing region).

Note that anv_image_create_info::stride has been unused since commit
b36938964063a4072abfd779f5607743dbc3b6f1.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>

Jason Ekstrand [Fri, 25 Mar 2016 20:55:37 +0000 (13:55 -0700)]

i965/vec4: Get rid of a stray predicate inverse in opquantizef16

This fixes 30 opquantize CTS tests on HSW

Jason Ekstrand [Fri, 25 Mar 2016 19:12:12 +0000 (12:12 -0700)]

nir/algebraic: Get rid of a redundant copy of fdiv lowering

Jason Ekstrand [Fri, 25 Mar 2016 19:09:33 +0000 (12:09 -0700)]

nir/algebraic: Add better lowering of ldexp

Jason Ekstrand [Fri, 25 Mar 2016 17:43:17 +0000 (10:43 -0700)]

nir/builder: Simplify nir_ssa_undef a bit

Jason Ekstrand [Fri, 25 Mar 2016 17:40:45 +0000 (10:40 -0700)]

nir/spirv: Use the nir_ssa_undef helper from nir_builder

Jason Ekstrand [Fri, 25 Mar 2016 17:40:24 +0000 (10:40 -0700)]

nir/builder: Add a bit size field to nir_ssa_undef

Jason Ekstrand [Fri, 25 Mar 2016 17:12:52 +0000 (10:12 -0700)]

nir: Add a better comment for INTRINSIC_RANGE

Jason Ekstrand [Fri, 25 Mar 2016 17:05:36 +0000 (10:05 -0700)]

nir/glsl: Stop carrying a pointer to the nir_shader in the visitor

Jordan Justen [Thu, 24 Mar 2016 20:05:04 +0000 (13:05 -0700)]

anv: Use genxml register support for L3 Cache config

The programming of the L3 Cache registers should match the previous
manually packed LRI values.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>

Jordan Justen [Thu, 24 Mar 2016 07:29:50 +0000 (00:29 -0700)]

genxml: Add L3 Cache Control register definitions

Based on intel_reg.h (5912da45a69923afa1b7f2eb5bb371d848813c41)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>

Jordan Justen [Thu, 24 Mar 2016 06:24:25 +0000 (23:24 -0700)]

anv: Add genxml register support

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>

Jordan Justen [Thu, 24 Mar 2016 06:24:25 +0000 (23:24 -0700)]

genxml: Add register support

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>

Jason Ekstrand [Fri, 25 Mar 2016 00:30:14 +0000 (17:30 -0700)]

Merge remote-tracking branch 'public/master' into vulkan

Nanley Chery [Tue, 22 Mar 2016 17:53:37 +0000 (10:53 -0700)]

anv: Sanitize Image extents and offsets

Prepare Image extents and offsets for internal consumption by assigning
the default values implicitly defined by the spec. Fixes textures on
several Vulkan demos in which the VkImageCopy depth is set to zero when
copying a 2D image.

v2 (Jason Ekstrand):
   Replace "prep" with "sanitize"
   Make function static inline
   Pass structs instead of pointers

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
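
The defaulting described above might look roughly like the following Python
sketch. The function name, the Extent3D tuple and the string image types are
illustrative assumptions, not the anv C code; the only idea taken from the
commit message is that dimensions an image type does not have are forced to
their implicit default before internal use.

```python
from typing import NamedTuple


class Extent3D(NamedTuple):
    width: int
    height: int
    depth: int


def sanitize_image_extent(image_type: str, extent: Extent3D) -> Extent3D:
    """Force the dimensions that do not apply to image_type to 1."""
    if image_type == "1D":
        return Extent3D(extent.width, 1, 1)
    if image_type == "2D":
        return Extent3D(extent.width, extent.height, 1)
    if image_type == "3D":
        return extent
    raise ValueError(f"unknown image type: {image_type}")


# A 2D copy whose depth was left at zero by the application.
copy_extent = sanitize_image_extent("2D", Extent3D(256, 128, 0))
```

Passing the structs by value, as the v2 notes mention, keeps the caller's
copy untouched; the NamedTuple here mirrors that by returning a new value.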

Jason Ekstrand [Sun, 14 Feb 2016 01:31:05 +0000 (17:31 -0800)]

nir: Add a pass to inline functions

This commit adds a new NIR pass that lowers all function calls away by
inlining the functions.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>

Jason Ekstrand [Sat, 26 Dec 2015 18:48:14 +0000 (10:48 -0800)]

nir/builder: Add helpers for easily inserting copy_var intrinsics

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>

Jason Ekstrand [Sun, 14 Feb 2016 01:08:57 +0000 (17:08 -0800)]

nir: Add return lowering pass

This commit adds a NIR pass for lowering away returns in functions. If the
return is in a loop, it is lowered to a break. If it is not in a loop,
it's lowered away by moving/deleting code as needed.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>

Jason Ekstrand [Mon, 28 Dec 2015 06:50:14 +0000 (22:50 -0800)]

nir: Add a cursor helper for getting a cursor after any phi nodes

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>

Jason Ekstrand [Sun, 14 Feb 2016 01:14:27 +0000 (17:14 -0800)]

nir/builder: Add a helper for inserting jump instructions

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>

Jason Ekstrand [Thu, 24 Dec 2015 02:10:08 +0000 (18:10 -0800)]

nir/cf: Make extracting or re-inserting nothing a no-op

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>

Jason Ekstrand [Sat, 26 Dec 2015 18:32:10 +0000 (10:32 -0800)]

nir: Add a function for comparing cursors

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>

Jason Ekstrand [Fri, 18 Dec 2015 19:27:00 +0000 (11:27 -0800)]

nir/cf: Handle relinking top-level blocks

This can happen if a function ends in a return instruction and you remove
the return.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>

Jason Ekstrand [Sat, 13 Feb 2016 05:52:46 +0000 (21:52 -0800)]

nir: Add a pass to repair SSA form

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>

Jason Ekstrand [Sat, 13 Feb 2016 05:48:26 +0000 (21:48 -0800)]

nir/vars_to_ssa: Use the new nir_phi_builder helper

The efficiency should be approximately the same.  We do a little more work
per phi node because we have to sort the predecessors.  However, we no
longer have to walk the blocks a second time to pop things off the stack.
The bigger advantage, however, is that we can now re-use the phi placement
and per-block SSA value tracking in other passes.

As a side benefit, the phi builder actually handles unreachable blocks
correctly.  The original vars_to_ssa code, because of the way it iterated
the blocks and added phi sources, didn't add sources corresponding to
predecessors of unreachable blocks.  The new strategy employed by the phi
builder creates a phi source for each predecessor and should correctly
handle unreachable blocks by setting those sources to SSA undefs.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>

Jason Ekstrand [Tue, 29 Dec 2015 23:25:43 +0000 (15:25 -0800)]

nir/dominance: Handle unreachable blocks

Previously, nir_dominance.c didn't properly handle unreachable blocks.
This can happen if, for instance, you have something like this:

loop {
   if (...) {
      break;
   } else {
      break;
   }
}

In this case, the block right after the if statement will be unreachable.
This commit makes two changes to handle this.  First, it removes an assert
and allows block->imm_dom to be null if the block is unreachable.  Second,
it properly skips unreachable blocks in calc_dom_frontier_cb.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
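
To illustrate the first change, here is a small Python sketch of an
iterative immediate-dominator computation in which unreachable blocks simply
keep a null immediate dominator. It is not the nir_dominance.c code: the
block names, the dictionary-based CFG and the function name are assumptions
made for the example.

```python
from typing import Dict, List, Optional


def immediate_dominators(succs: Dict[str, List[str]],
                         entry: str) -> Dict[str, Optional[str]]:
    """Map each block to its immediate dominator.

    The entry block maps to itself; unreachable blocks map to None.
    """
    # Reverse post-order over the blocks reachable from the entry.
    order: List[str] = []
    seen = set()

    def dfs(block: str) -> None:
        seen.add(block)
        for succ in succs[block]:
            if succ not in seen:
                dfs(succ)
        order.append(block)

    dfs(entry)
    order.reverse()
    index = {block: i for i, block in enumerate(order)}

    preds: Dict[str, List[str]] = {block: [] for block in succs}
    for block, block_succs in succs.items():
        for succ in block_succs:
            preds[succ].append(block)

    idom: Dict[str, Optional[str]] = {block: None for block in succs}
    idom[entry] = entry

    def intersect(a: str, b: str) -> str:
        # Walk both blocks up the dominator tree until they meet.
        while a != b:
            while index[a] > index[b]:
                a = idom[a]
            while index[b] > index[a]:
                b = idom[b]
        return a

    changed = True
    while changed:
        changed = False
        for block in order[1:]:
            new_idom = None
            for pred in preds[block]:
                # Skip unreachable or not-yet-processed predecessors.
                if pred not in index or idom[pred] is None:
                    continue
                new_idom = pred if new_idom is None else intersect(pred, new_idom)
            if new_idom != idom[block]:
                idom[block] = new_idom
                changed = True
    return idom


# The loop from the commit message: both branches break, so the block
# after the if statement has no predecessors.
cfg = {
    "entry": ["if"],
    "if": ["then", "else"],
    "then": ["after_loop"],
    "else": ["after_loop"],
    "after_if": ["if"],       # unreachable; would loop back to the header
    "after_loop": [],
}
idom = immediate_dominators(cfg, "entry")
```

Here idom["after_if"] stays None, which is the case the removed assert used
to reject, and the unreachable predecessor of "if" is skipped rather than
fed into the intersection.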

Jason Ekstrand [Sat, 13 Feb 2016 05:41:42 +0000 (21:41 -0800)]

nir: Add a phi node placement helper

Right now, we have phi placement code in two places and there are other
places where it would be nice to be able to do this analysis. Instead of
repeating it all over the place, this commit adds a helper for placing all
of the needed phi nodes for a value.

v2: Add better documentation

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>

Jason Ekstrand [Sun, 17 Jan 2016 00:42:06 +0000 (16:42 -0800)]

util/bitset: Allow iterating over const bitsets

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>

Rob Clark [Thu, 24 Mar 2016 19:44:35 +0000 (15:44 -0400)]

ttn: remove stray global from header

Signed-off-by: Rob Clark <robclark@freedesktop.org>

Samuel Pitoiset [Wed, 23 Mar 2016 22:29:20 +0000 (23:29 +0100)]

nv50/ir: silence unhandled TGSI_PROPERTY_NEXT_SHADER info

radeonsi uses this property to make the best decision about which
shader to compile, but this is not currently used by our codegen.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>

Kenneth Graunke [Thu, 24 Mar 2016 06:46:12 +0000 (23:46 -0700)]

mesa: Handle negative length in glPushDebugGroup().

The KHR_debug spec doesn't actually say we should handle this, but that
is most likely an oversight - it says to check against strlen and
generate errors if length is negative. It appears they just forgot to
explicitly spell out that we should then proceed to actually handle it.

Fixes crashes from uncaught std::string exceptions in many
dEQP-GLES31.functional.debug.error_filters.* tests.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>

Kenneth Graunke [Thu, 24 Mar 2016 06:35:40 +0000 (23:35 -0700)]

mesa: Make glDebugMessageInsert deal with negative length for all types.

From the KHR_debug spec, section 5.5.5 (Externally Generated Messages):

   "If <length> is negative, it is implied that <buf> contains a null
    terminated string. The error INVALID_VALUE will be generated if the
    number of characters in <buf>, excluding the null terminator when
    <length> is negative, is not less than the value of
    MAX_DEBUG_MESSAGE_LENGTH."

This indicates that length should be set to strlen for all types, not
just GL_DEBUG_TYPE_MARKER.  We want it to be after validate_length()
so we still generate appropriate errors.

Fixes crashes from uncaught std::string exceptions in many
dEQP-GLES31.functional.debug.error_filters.* tests.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
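
The ordering described above (validate first, then replace a negative length
with the string length) can be sketched in Python as follows. This is an
illustration, not Mesa's C code: insert_message is a hypothetical name, this
validate_length only borrows the name of the Mesa function, and the limit of
4096 is an assumed value for MAX_DEBUG_MESSAGE_LENGTH.

```python
MAX_DEBUG_MESSAGE_LENGTH = 4096  # assumed limit, for illustration only


def c_strlen(buf: bytes) -> int:
    """Length up to the first null byte; buf is assumed null terminated."""
    return buf.index(b"\0")


def validate_length(buf: bytes, length: int) -> bool:
    """Check the character count against the limit, as the spec quote says."""
    count = c_strlen(buf) if length < 0 else length
    return count < MAX_DEBUG_MESSAGE_LENGTH


def insert_message(buf: bytes, length: int):
    """Return the logged bytes, or the string "INVALID_VALUE" on error."""
    if not validate_length(buf, length):
        return "INVALID_VALUE"
    # Only after validation: a negative length means "use strlen",
    # regardless of the message type.
    if length < 0:
        length = c_strlen(buf)
    return buf[:length]


logged = insert_message(b"hello\0", -1)
too_long = insert_message(b"x" * 5000 + b"\0", -1)
```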

Kenneth Graunke [Thu, 24 Mar 2016 04:38:42 +0000 (21:38 -0700)]

mesa: Include null terminator in GL_DEBUG_NEXT_LOGGED_MESSAGE_LENGTH.

From the KHR_debug spec:
"Applications can query the number of messages currently in the log by
obtaining the value of DEBUG_LOGGED_MESSAGES, and the string length
(including its null terminator) of the oldest message in the log
through the value of DEBUG_NEXT_LOGGED_MESSAGE_LENGTH."

Because we weren't including the null terminator, many dEQP tests
called glGetDebugMessageLog with a bufSize parameter that was 1 too
small, and unable to contain the message, so we skipped returning it,
failing many cases.

Fixes 298 dEQP-GLES31.functional.debug.* tests.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Stephane Marchesin <stephane.marchesin@gmail.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
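
A short Python sketch of the off-by-one described above. The function names
are hypothetical; the only behaviour taken from the commit message is that
the reported length includes the null terminator and that a message is
skipped when it does not fit in bufSize.

```python
def next_logged_message_length(message: bytes) -> int:
    """Length of the oldest message including its null terminator."""
    return len(message) + 1


def fits_in_buffer(message: bytes, buf_size: int) -> bool:
    """A message is returned only if it and its terminator fit."""
    return len(message) + 1 <= buf_size


msg = b"shader compile failed"
old_query = len(msg)                          # value reported before the fix
new_query = next_logged_message_length(msg)   # value reported after the fix

returned_with_old = fits_in_buffer(msg, old_query)
returned_with_new = fits_in_buffer(msg, new_query)
```

An application that sizes its buffer from the query therefore only gets the
message back once the query counts the terminator.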

Nicolai Hähnle [Wed, 23 Mar 2016 20:22:16 +0000 (15:22 -0500)]

st/mesa: use RGBA instead of BGRA for SRGB_ALPHA

This fixes a regression introduced by commit a8eea696 "st/mesa: honour sized
internal formats in st_choose_format (v2)".

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94657
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94671
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

Nicolai Hähnle [Wed, 23 Mar 2016 16:58:28 +0000 (11:58 -0500)]

radeonsi: silence a coverity warning

The following Coverity warning

5378      tmpl.fetch_args = atomic_fetch_args;
5379      tmpl.emit = atomic_emit;
>>>     CID 1357115:  Uninitialized variables  (UNINIT)
>>>     Using uninitialized value "tmpl". Field "tmpl.intr_name" is uninitialized.
5380      bld_base->op_actions[TGSI_OPCODE_ATOMUADD] = tmpl;
5381      bld_base->op_actions[TGSI_OPCODE_ATOMUADD].intr_name = "add";

... is a false positive, but what the hell. This change should "fix" it.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

Bas Nieuwenhuizen [Thu, 24 Mar 2016 14:30:09 +0000 (08:30 -0600)]

mesa: replace gl_context->Multisample._Enabled with _mesa_is_multisample_enabled.

This removes any dependency on driver validation of the number of
framebuffer samples.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Brian Paul <brianp@vmware.com>

Rob Clark [Tue, 22 Mar 2016 19:02:42 +0000 (15:02 -0400)]

nir: fix dangling ssadef->name ptrs

In many places, the convention is to pass an existing ssadef name ptr
when constructing/initializing a new nir_ssa_def. But that goes badly
(as noticed by garbage in nir_print output) when the original string
gets freed.

Just use ralloc_strdup() instead, and add ralloc_free() in the two
places that would care (not that the strings wouldn't eventually get
freed anyways).

Also fixup the nir_search code which was directly setting ssadef->name
to use the parent instruction as memctx.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>

Jason Ekstrand [Thu, 24 Mar 2016 04:04:18 +0000 (21:04 -0700)]

glsl: Add propagate_invariance to the other makefile

This fixes the scons build

Jason Ekstrand [Thu, 17 Mar 2016 22:20:20 +0000 (15:20 -0700)]

nir/glsl: Propagate invariant into NIR alu ops

Reviewed-by: Francisco Jerez <currojerez@riseup.net>

Jason Ekstrand [Thu, 17 Mar 2016 21:44:57 +0000 (14:44 -0700)]

glsl/rebalance_tree: Don't handle invariant or precise trees

Reviewed-by: Francisco Jerez <currojerez@riseup.net>

Jason Ekstrand [Thu, 17 Mar 2016 21:41:14 +0000 (14:41 -0700)]

glsl/opt_algebraic: Don't handle invariant or precise trees

Reviewed-by: Francisco Jerez <currojerez@riseup.net>

Jason Ekstrand [Thu, 17 Mar 2016 20:58:40 +0000 (13:58 -0700)]

glsl: Add a pass to propagate the "invariant" and "precise" qualifiers

Reviewed-by: Francisco Jerez <currojerez@riseup.net>

Jason Ekstrand [Thu, 17 Mar 2016 23:13:40 +0000 (16:13 -0700)]

nir/alu_to_scalar: Propagate the "exact" bit

Reviewed-by: Francisco Jerez <currojerez@riseup.net>

Jason Ekstrand [Thu, 17 Mar 2016 20:39:07 +0000 (13:39 -0700)]

i965/peephole_ffma: Don't fuse exact adds

Reviewed-by: Francisco Jerez <currojerez@riseup.net>

Jason Ekstrand [Thu, 17 Mar 2016 18:38:54 +0000 (11:38 -0700)]

nir/cse: Properly handle nir_ssa_def.exact

Reviewed-by: Francisco Jerez <currojerez@riseup.net>

Jason Ekstrand [Thu, 17 Mar 2016 18:31:48 +0000 (11:31 -0700)]

nir/algebraic: Flag inexact optimizations

Many of our optimizations, while great for cutting shaders down to size,
aren't really precision-safe. This commit tries to flag all of the
inexact floating-point optimizations so they don't get run on values that
are flagged "exact". It's a bit conservative and maybe flags some safe
optimizations as unsafe but that's better than missing one.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>

Jason Ekstrand [Wed, 23 Mar 2016 21:30:29 +0000 (14:30 -0700)]

nir/algebraic: Fix fmin detection to match the spec

The previous transformation got the arguments to fmin backwards. When NaNs
are involved, the GLSL min/max aren't commutative so it matters.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
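
Why argument order matters can be seen by taking the GLSL definition of min
literally ("returns y if y < x, otherwise it returns x") and feeding it a
NaN. The Python function below is only a model of that sentence, not NIR
code, and glsl_min is a name chosen for the example.

```python
import math


def glsl_min(x: float, y: float) -> float:
    """Model of the GLSL definition: y if y < x, otherwise x."""
    return y if y < x else x


nan = float("nan")

# Any comparison with NaN is false, so the first argument is returned.
nan_first = glsl_min(nan, 1.0)    # NaN
nan_second = glsl_min(1.0, nan)   # 1.0
```

Swapping the operands changes the result from NaN to 1.0, so a pattern that
matches the comparison and select with the arguments reversed does not
describe the same operation.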

Jason Ekstrand [Wed, 23 Mar 2016 21:25:56 +0000 (14:25 -0700)]

nir/algebraic: Get rid of an invalid fxor optimization

The fxor opcode is required to return 1.0f or 0.0f but the input variable
may not be 1.0f or 0.0f.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
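
The commit text does not quote the rule that was removed. The Python sketch
below assumes, purely for illustration, that it had the form
fxor(a, 0.0) -> a, and models fxor as a logical xor that returns 1.0 or 0.0.

```python
def fxor(a: float, b: float) -> float:
    """Model of a float xor: 1.0 if exactly one input is non-zero."""
    return 1.0 if (a != 0.0) != (b != 0.0) else 0.0


# Assumed shape of the removed rule: replace fxor(a, 0.0) with a.
a = 2.5
before_rewrite = fxor(a, 0.0)   # 1.0
after_rewrite = a               # 2.5
```

The rewrite is only sound when a is already 0.0 or 1.0, which is exactly the
guarantee the commit message says the input does not have.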

Jason Ekstrand [Thu, 17 Mar 2016 18:04:49 +0000 (11:04 -0700)]

nir/algebraic: Allow for flagging operations as being inexact

Reviewed-by: Francisco Jerez <currojerez@riseup.net>

Jason Ekstrand [Thu, 17 Mar 2016 22:20:34 +0000 (15:20 -0700)]

nir/search: Propagate exactness into newly created expressions

Reviewed-by: Francisco Jerez <currojerez@riseup.net>

Jason Ekstrand [Thu, 17 Mar 2016 22:54:26 +0000 (15:54 -0700)]

nir/builder: Add a flag for setting exact

Reviewed-by: Francisco Jerez <currojerez@riseup.net>

Jason Ekstrand [Thu, 17 Mar 2016 17:50:27 +0000 (10:50 -0700)]

nir: Add an "exact" bit to nir_alu_instr

Reviewed-by: Francisco Jerez <currojerez@riseup.net>

Jason Ekstrand [Wed, 23 Mar 2016 22:05:55 +0000 (15:05 -0700)]

nir/clone: Export nir_variable_clone

Reviewed-by: Rob Clark <robclark@gmail.com>

Jason Ekstrand [Thu, 31 Dec 2015 02:44:19 +0000 (18:44 -0800)]

nir/clone: Expose nir_constant_clone

Reviewed-by: Rob Clark <robclark@gmail.com>

Jason Ekstrand [Wed, 23 Mar 2016 21:57:57 +0000 (14:57 -0700)]

nir: Fix whitespace

Reviewed-by: Rob Clark <robclark@gmail.com>

Brian Paul [Wed, 23 Mar 2016 18:55:45 +0000 (12:55 -0600)]

docs: use latest libDRM version

Signed-off-by: Brian Paul <brianp@vmware.com>

Lars Hamre [Wed, 23 Mar 2016 14:14:23 +0000 (10:14 -0400)]

compiler/glsl: allow sequence op as a const expr in gles 1.0

Allow the sequence operator to be a constant expression in GLSL ES
versions prior to GLSL ES 3.0

Fixes the following piglit test:
/all/spec/glsl-es-1.0/compiler/array-sized-by-sequence-in-parenthesis.vert

This is similar to the logic from process_initializer() which performs
the same check for constant variable initialization with sequence
operators.

v2: Fixed regression pointed out by Eduardo Lima Mitev

Signed-off-by: Lars Hamre <chemecse@gmail.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>

Nicolai Hähnle [Mon, 21 Mar 2016 19:50:50 +0000 (14:50 -0500)]

radeonsi: fix out-of-bounds indexing of shader images

Results are undefined but may not crash. Without this change, out-of-bounds
indexing can lead to VM faults and GPU hangs.

Constant buffers, samplers, and possibly others will eventually need similar
treatment to support GL_ARB_robust_buffer_access_behavior.

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>

Nicolai Hähnle [Thu, 17 Mar 2016 01:47:47 +0000 (20:47 -0500)]

radeonsi: cache flush/invalidation for missing PIPE_BARRIER_*_BUFFER bits (v2)

This fixes arb_shader_image_load_store-host-mem-barrier.

v2: flush TC L2 for index buffers on <= CIK (Marek)

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

Nicolai Hähnle [Fri, 18 Mar 2016 00:49:26 +0000 (19:49 -0500)]

st/mesa: add missing MemoryBarrier bits and some explanations

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

Nicolai Hähnle [Fri, 18 Mar 2016 00:49:03 +0000 (19:49 -0500)]

gallium: add PIPE_BARRIER_STREAMOUT_BUFFER

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

Marek Olšák [Tue, 22 Mar 2016 17:26:53 +0000 (18:26 +0100)]

radeonsi: fix 2D array MSAA failures since image support landed

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>

Jason Ekstrand [Thu, 17 Mar 2016 02:31:02 +0000 (19:31 -0700)]

i965/fs: Don't constant-fold RCP

No shader-db changes on Broadwell

Reviewed-by: Matt Turner <mattst88@gmail.com>

Jason Ekstrand [Wed, 16 Mar 2016 23:06:10 +0000 (16:06 -0700)]

i965: Remove the RCP+RSQ algebraic optimizations

NIR already has this optimization and it can do much better than the little
peephole in the backend.

No shader-db change on Haswell or Broadwell.

Reviewed-by: Matt Turner <mattst88@gmail.com>

Jason Ekstrand [Tue, 22 Mar 2016 23:21:21 +0000 (16:21 -0700)]

anv/device: Advertise version 1.0.5

Nothing substantial has changed since 1.0.2

Jason Ekstrand [Tue, 22 Mar 2016 23:17:09 +0000 (16:17 -0700)]

anv/device: Ignore the patch portion of the requested API version

Fixes dEQP-VK.api.device_init.create_instance_name_version

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94661
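
Vulkan packs a version as major << 22 | minor << 12 | patch, so ignoring the
patch portion amounts to masking off the low 12 bits before comparing. The
Python sketch below shows that masking; the helper names and the comparison
against a single supported version are assumptions for the example, not the
anv implementation.

```python
def vk_make_version(major: int, minor: int, patch: int) -> int:
    """Pack a version the way VK_MAKE_VERSION does."""
    return (major << 22) | (minor << 12) | patch


def without_patch(version: int) -> int:
    """Clear the low 12 bits, which hold the patch number."""
    return version & ~0xFFF


def api_version_accepted(requested: int, supported: int) -> bool:
    """Hypothetical check: compare major and minor only."""
    return without_patch(requested) == without_patch(supported)


supported = vk_make_version(1, 0, 5)
accepts_newer_patch = api_version_accepted(vk_make_version(1, 0, 31), supported)
accepts_other_minor = api_version_accepted(vk_make_version(1, 1, 0), supported)
```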

Jason Ekstrand [Tue, 22 Mar 2016 23:11:53 +0000 (16:11 -0700)]

anv: Don't assert-fail if someone asks for a non-existent entrypoint

Jason Ekstrand [Tue, 22 Mar 2016 23:06:53 +0000 (16:06 -0700)]

Update to the latest Vulkan header from Khronos

Ian Romanick [Wed, 2 Mar 2016 21:47:56 +0000 (13:47 -0800)]

nir: Don't abs slt and friends

No shader-db changes, but this is symmetric with the previous commit.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>

Ian Romanick [Wed, 2 Mar 2016 21:46:50 +0000 (13:46 -0800)]

nir: Don't abs the result of b2f or b2i

In the results below, 2 SIMD16 shaders in Trine are lost.

G4X
total instructions in shared programs: 4012279 -> 4011108 (-0.03%)
instructions in affected programs: 116776 -> 115605 (-1.00%)
helped: 339
HURT: 0

total cycles in shared programs: 84315862 -> 84313584 (-0.00%)
cycles in affected programs: 1767232 -> 1764954 (-0.13%)
helped: 274
HURT: 81

Ironlake
total instructions in shared programs: 6399073 -> 6396998 (-0.03%)
instructions in affected programs: 218050 -> 215975 (-0.95%)
helped: 600
HURT: 0

total cycles in shared programs: 128892088 -> 128888810 (-0.00%)
cycles in affected programs: 2867452 -> 2864174 (-0.11%)
helped: 422
HURT: 137

Sandy Bridge
total instructions in shared programs: 8462174 -> 8460759 (-0.02%)
instructions in affected programs: 178529 -> 177114 (-0.79%)
helped: 596
HURT: 0

total cycles in shared programs: 117542276 -> 117534098 (-0.01%)
cycles in affected programs: 1239166 -> 1230988 (-0.66%)
helped: 369
HURT: 150

Ivy Bridge
total instructions in shared programs: 7775131 -> 7773410 (-0.02%)
instructions in affected programs: 162903 -> 161182 (-1.06%)
helped: 590
HURT: 0

total cycles in shared programs: 65759882 -> 65747268 (-0.02%)
cycles in affected programs: 1004354 -> 991740 (-1.26%)
helped: 467
HURT: 141

Haswell
total instructions in shared programs: 7107786 -> 7106327 (-0.02%)
instructions in affected programs: 140954 -> 139495 (-1.04%)
helped: 590
HURT: 0

total cycles in shared programs: 64668028 -> 64655322 (-0.02%)
cycles in affected programs: 967080 -> 954374 (-1.31%)
helped: 452
HURT: 149

LOST:   2
GAINED: 0

Broadwell
total instructions in shared programs: 8980029 -> 8978287 (-0.02%)
instructions in affected programs: 197232 -> 195490 (-0.88%)
helped: 715
HURT: 0

total cycles in shared programs: 70070448 -> 70055970 (-0.02%)
cycles in affected programs: 975724 -> 961246 (-1.48%)
helped: 471
HURT: 111

LOST:   2
GAINED: 0

Skylake
total instructions in shared programs: 9115178 -> 9113436 (-0.02%)
instructions in affected programs: 203012 -> 201270 (-0.86%)
helped: 715
HURT: 0

total cycles in shared programs: 68848660 -> 68834004 (-0.02%)
cycles in affected programs: 993888 -> 979232 (-1.47%)
helped: 473
HURT: 116

LOST:   2
GAINED: 0

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>

Ian Romanick [Wed, 2 Mar 2016 23:36:14 +0000 (15:36 -0800)]

nir: Simplify 0 < fabs(a)

Sandy Bridge / Ivy Bridge / Haswell
total instructions in shared programs: 8462180 -> 8462174 (-0.00%)
instructions in affected programs: 564 -> 558 (-1.06%)
helped: 6
HURT: 0

total cycles in shared programs: 117542462 -> 117542276 (-0.00%)
cycles in affected programs: 9768 -> 9582 (-1.90%)
helped: 12
HURT: 0

Broadwell / Skylake
total instructions in shared programs: 8980833 -> 8980826 (-0.00%)
instructions in affected programs: 626 -> 619 (-1.12%)
helped: 7
HURT: 0

total cycles in shared programs: 70077900 -> 70077714 (-0.00%)
cycles in affected programs: 9378 -> 9192 (-1.98%)
helped: 12
HURT: 0

G45 and Ironlake showed no change.

v2: Modify the comments to look more like a proof.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
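
The title does not spell out what the comparison simplifies to. The Python
sketch below assumes the target is a != 0.0, on the reasoning that fabs(a)
is never negative, and checks the two forms against each other on a handful
of values. It models the float comparison only and is not NIR code.

```python
import math


def flt(a: float, b: float) -> bool:
    """Model of a float less-than comparison."""
    return a < b


samples = [0.0, -0.0, 1.5, -2.0, math.inf, -math.inf]
agrees = all(flt(0.0, abs(a)) == (a != 0.0) for a in samples)

# NaN is the one input where the two forms differ: 0 < |NaN| is false,
# while NaN != 0 is true.
nan_agrees = flt(0.0, abs(math.nan)) == (math.nan != 0.0)
```

Under that assumption the rewrite holds for every non-NaN input, including
both zeros and both infinities.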

Ian Romanick [Wed, 2 Mar 2016 23:18:34 +0000 (15:18 -0800)]

nir: Simplify 0 >= b2f(a)

This also prevented some regressions with other patches in my local
tree.

Broadwell / Skylake
total instructions in shared programs: 8980835 -> 8980833 (-0.00%)
instructions in affected programs: 45 -> 43 (-4.44%)
helped: 1
HURT: 0

total cycles in shared programs: 70077904 -> 70077900 (-0.00%)
cycles in affected programs: 122 -> 118 (-3.28%)
helped: 1
HURT: 0

No changes on earlier platforms.

v2: Modify the comments to look more like a proof.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>

Ian Romanick [Wed, 2 Mar 2016 02:59:57 +0000 (18:59 -0800)]

nir: Simplify i2b with negated or abs operand

This enables removing ssa_201 and ssa_202 in sequences like:

                 vec1 ssa_200 = flt ssa_199, ssa_194
                 vec1 ssa_201 = b2i ssa_200
                 vec1 ssa_202 = i2b -ssa_201

shader-db results:

Sandy Bridge
total instructions in shared programs: 8462257 -> 8462180 (-0.00%)
instructions in affected programs: 3846 -> 3769 (-2.00%)
helped: 35
HURT: 0

total cycles in shared programs: 117542934 -> 117542462 (-0.00%)
cycles in affected programs: 20072 -> 19600 (-2.35%)
helped: 20
HURT: 1

Ivy Bridge
total instructions in shared programs: 7775252 -> 7775137 (-0.00%)
instructions in affected programs: 3645 -> 3530 (-3.16%)
helped: 35
HURT: 0

total cycles in shared programs: 65760522 -> 65760068 (-0.00%)
cycles in affected programs: 21082 -> 20628 (-2.15%)
helped: 25
HURT: 2

Haswell
total instructions in shared programs: 7108666 -> 7108589 (-0.00%)
instructions in affected programs: 3253 -> 3176 (-2.37%)
helped: 35
HURT: 0

total cycles in shared programs: 64675726 -> 64675272 (-0.00%)
cycles in affected programs: 21034 -> 20580 (-2.16%)
helped: 26
HURT: 1

Broadwell / Skylake
total instructions in shared programs: 8980912 -> 8980835 (-0.00%)
instructions in affected programs: 3223 -> 3146 (-2.39%)
helped: 35
HURT: 0

total cycles in shared programs: 70077926 -> 70077904 (-0.00%)
cycles in affected programs: 21886 -> 21864 (-0.10%)
helped: 21
HURT: 6

G45 and Ironlake showed no change.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
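
The reason ssa_201 and ssa_202 become removable is that negation and
absolute value never change whether an integer is zero, so i2b of either
equals i2b of the original value, and i2b(b2i(x)) is just x. A small Python
model of the sequence above, with Python ints and bools standing in for the
NIR values (an assumption of the sketch):

```python
def flt(a: float, b: float) -> bool:
    return a < b


def b2i(b: bool) -> int:
    return 1 if b else 0


def i2b(i: int) -> bool:
    return i != 0


def original_sequence(x: float, y: float) -> bool:
    ssa_200 = flt(x, y)
    ssa_201 = b2i(ssa_200)
    ssa_202 = i2b(-ssa_201)
    return ssa_202


pairs = [(1.0, 2.0), (2.0, 1.0), (0.0, 0.0), (-3.0, -1.0)]
sequence_is_redundant = all(original_sequence(x, y) == flt(x, y)
                            for x, y in pairs)
sign_is_irrelevant = all(i2b(-i) == i2b(i) == i2b(abs(i))
                         for i in (-7, -1, 0, 1, 42))
```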

Ian Romanick [Mon, 7 Mar 2016 21:09:30 +0000 (13:09 -0800)]

nir: Lower flrp with Boolean interpolator to bcsel

On Intel platforms that don't set lower_flrp, using bcsel instead of
flrp seems to be a small amount worse. On those platforms, the use of
flrp, bcsel, and multiply of b2f is still an active area of research.
In review, Matt suggested this is because bcsel turns into CMP+SEL, and
because of the flag register we can't schedule instructions well.

shader-db results:

G4X / Ironlake
total instructions in shared programs: 4016538 -> 4012279 (-0.11%)
instructions in affected programs: 161556 -> 157297 (-2.64%)
helped: 1077
HURT: 1

total cycles in shared programs: 84328296 -> 84315862 (-0.01%)
cycles in affected programs: 4174570 -> 4162136 (-0.30%)
helped: 926
HURT: 53

Unsurprisingly, no changes on later platforms.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
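
The lowering named in the title rests on the fact that a linear
interpolation whose weight is b2f(c) can only ever pick one of its two
endpoints. The Python sketch below models flrp as a*(1-t) + b*t and compares
it with a select; the function bodies are illustrative models rather than
NIR definitions, and the check uses finite inputs only.

```python
def flrp(a: float, b: float, t: float) -> float:
    """Model of linear interpolation: a at t = 0, b at t = 1."""
    return a * (1.0 - t) + b * t


def b2f(c: bool) -> float:
    return 1.0 if c else 0.0


def bcsel(c: bool, x: float, y: float) -> float:
    """Select x when c is true, otherwise y."""
    return x if c else y


endpoints = [(2.0, 5.0), (-1.5, 0.25), (0.0, -3.0)]
lowering_matches = all(
    flrp(a, b, b2f(c)) == bcsel(c, b, a)
    for a, b in endpoints
    for c in (False, True)
)
```

With infinite endpoints the multiply by zero would produce NaN in the flrp
form but not in the select, which is why the comparison above is restricted
to finite values.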

Ian Romanick [Mon, 7 Mar 2016 18:55:21 +0000 (10:55 -0800)]

i965: Have NIR lower flrp on pre-GEN6 vec4 backend

Previously we were doing the lowering by hand in vec4_visitor::emit_lrp.
By doing it in NIR, we have the opportunity for NIR to do additional
optimization of the expanded code.

This also enables optimizations added by the next commit.

shader-db results:

G4X / Ironlake
total instructions in shared programs: 4024401 -> 4016538 (-0.20%)
instructions in affected programs: 447686 -> 439823 (-1.76%)
helped: 2623
HURT: 0

total cycles in shared programs: 84375846 -> 84328296 (-0.06%)
cycles in affected programs: 16964960 -> 16917410 (-0.28%)
helped: 2556
HURT: 41

Unsurprisingly, no changes on later platforms.

v2: Formatting and comment changes suggested by Matt.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>

commit | commitdiff | tree

Brian Paul [Tue, 22 Mar 2016 14:35:25 +0000 (08:35 -0600)]

swrast: fix discarded const warning in s_texture.c

Signed-off-by: Brian Paul <brianp@vmware.com>

commit | commitdiff | tree

Marc-André Lureau [Fri, 18 Mar 2016 19:01:07 +0000 (20:01 +0100)]

i965: fix invalid memory write

I noticed some heap corruption running virgl tests, and valgrind
helped me to track it down to the following error:

==29272== Invalid write of size 4
==29272==    at 0x90283D4: push_loop_stack (brw_eu_emit.c:1307)
==29272==    by 0x9029A7D: brw_DO (brw_eu_emit.c:1750)
==29272==    by 0x90554B0: fs_generator::generate_code(cfg_t const*, int) (brw_fs_generator.cpp:1999)
==29272==    by 0x904491F: brw_compile_fs (brw_fs.cpp:5685)
==29272==    by 0x8FC5DC5: brw_codegen_wm_prog (brw_wm.c:137)
==29272==    by 0x8FC7663: brw_fs_precompile (brw_wm.c:638)
==29272==    by 0x8FA4040: brw_shader_precompile(gl_context*, gl_shader_program*) (brw_link.cpp:51)
==29272==    by 0x8FA4A9A: brw_link_shader (brw_link.cpp:260)
==29272==    by 0x8DEF751: _mesa_glsl_link_shader (ir_to_mesa.cpp:3006)
==29272==    by 0x8C84325: _mesa_link_program (shaderapi.c:1042)
==29272==    by 0x8C851D7: _mesa_LinkProgram (shaderapi.c:1515)
==29272==    by 0x4E4B8E8: add_shader_program (vrend_renderer.c:880)
==29272==  Address 0xf2f3cb0 is 0 bytes after a block of size 112 alloc'd
==29272==    at 0x4C2AA98: calloc (vg_replace_malloc.c:711)
==29272==    by 0x8ED11F7: ralloc_size (ralloc.c:113)
==29272==    by 0x8ED1282: rzalloc_size (ralloc.c:134)
==29272==    by 0x8ED14C0: rzalloc_array_size (ralloc.c:196)
==29272==    by 0x9019C7B: brw_init_codegen (brw_eu.c:291)
==29272==    by 0x904F565: fs_generator::fs_generator(brw_compiler const*, void*, void*, void const*, brw_stage_prog_data*, unsigned int, bool, gl_shader_stage) (brw_fs_generator.cpp:124)
==29272==    by 0x9044883: brw_compile_fs (brw_fs.cpp:5675)
==29272==    by 0x8FC5DC5: brw_codegen_wm_prog (brw_wm.c:137)
==29272==    by 0x8FC7663: brw_fs_precompile (brw_wm.c:638)
==29272==    by 0x8FA4040: brw_shader_precompile(gl_context*, gl_shader_program*) (brw_link.cpp:51)
==29272==    by 0x8FA4A9A: brw_link_shader (brw_link.cpp:260)
==29272==    by 0x8DEF751: _mesa_glsl_link_shader (ir_to_mesa.cpp:3006)

if_depth_in_loop is an array of size p->loop_stack_array_size, and
push_loop_stack() will access if_depth_in_loop[p->loop_stack_depth+1],
thus the condition to grow the array should be
p->loop_stack_array_size <= (p->loop_stack_depth + 1) (it's currently
off by 2...)
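
The corrected growth check can be sketched roughly as follows. This is a
simplified stand-in, not the code from brw_eu_emit.c: struct loop_state and
push_loop() are hypothetical names, plain realloc() replaces the ralloc
helpers, and the array is assumed to start with at least one element.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the relevant fields of the codegen state. */
struct loop_state {
   int *if_depth_in_loop;
   int loop_stack_array_size;   /* number of allocated elements */
   int loop_stack_depth;        /* current loop nesting depth */
};

/* Returns 0 on success, -1 on allocation failure. */
static int push_loop(struct loop_state *p)
{
   /* The write below touches index loop_stack_depth + 1, so the array
    * must grow whenever that index is not strictly below its size. */
   if (p->loop_stack_array_size <= p->loop_stack_depth + 1) {
      int new_size = p->loop_stack_array_size * 2;
      int *grown = realloc(p->if_depth_in_loop, new_size * sizeof(int));
      if (grown == NULL)
         return -1;
      p->if_depth_in_loop = grown;
      p->loop_stack_array_size = new_size;
   }

   p->loop_stack_depth++;
   p->if_depth_in_loop[p->loop_stack_depth] = 0;
   return 0;
}
```

With this check, the index written after the increment is always inside the
allocation, however deeply the loops nest.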

This can be reproduced by running the following test with virgl test
server:
LIBGL_ALWAYS_SOFTWARE=y GALLIUM_DRIVER=virpipe bin/shader_runner
./tests/shaders/glsl-fs-unroll-explosion.shader_test -auto

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

commit | commitdiff | tree

Dave Airlie [Tue, 22 Mar 2016 00:28:44 +0000 (10:28 +1000)]

tgsi: drop unused set_exec/kill_mask interfaces.

These aren't used, and as far as I can see they never have been at any
point in git history, so drop them.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>

commit | commitdiff | tree

Dave Airlie [Mon, 21 Mar 2016 23:53:47 +0000 (09:53 +1000)]

docs/relnotes: update ARB_internalformat_query2 status.

Signed-off-by: Dave Airlie <Airlied@redhat.com>

commit | commitdiff | tree

Dave Airlie [Fri, 4 Mar 2016 02:33:46 +0000 (12:33 +1000)]

st/mesa: add support for internalformat query2.

Add code to handle GL_INTERNALFORMAT_PREFERRED.
Add code to deal with GL_RENDERBUFFER being passed into ChooseTextureFormat.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>

commit | commitdiff | tree

Jason Ekstrand [Fri, 18 Mar 2016 23:32:46 +0000 (16:32 -0700)]

anv/batch_chain: Fall back to growing batches when chaining isn't available

commit | commitdiff | tree

Anuj Phogat [Fri, 11 Mar 2016 23:24:36 +0000 (15:24 -0800)]

i965: Fix assert conditions for src/dst x/y offsets

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

commit | commitdiff | tree

Anuj Phogat [Mon, 14 Mar 2016 17:25:50 +0000 (10:25 -0700)]

swrast: Move assert for 'slice' into check_map_teximage

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>

commit | commitdiff | tree

xavier [Wed, 9 Mar 2016 08:58:48 +0000 (09:58 +0100)]

r600/sb: Do not distribute neg in expr_handler::fold_assoc() when folding multiplications.

Previously it was doing this transformation for a Trine 3 shader:
     MUL     R6.x.12,    R13.x.23, 0.5|3f000000
-    MULADD     R4.x.12,    -R6.x.12, 2|40000000, 1|3f800000
+    MULADD     R4.x.12,    -R13.x.23, -1|bf800000, 1|3f800000
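
To see why that fold is wrong, the arithmetic can be written out in plain C.
This is only one reading of the two instruction sequences above; the
function names are made up for the example and the register operands are
modeled as ordinary floats.

```c
/* Original sequence: MUL by 0.5, then MULADD with a negated source. */
static float unfolded(float r13)
{
   float r6 = r13 * 0.5f;
   return -r6 * 2.0f + 1.0f;      /* equals 1 - r13 */
}

/* The faulty fold: the neg modifier stays on the source and is also
 * distributed into the constant, so the negation is applied twice. */
static float folded_wrong(float r13)
{
   return -r13 * -1.0f + 1.0f;    /* equals 1 + r13 */
}

/* Folding only the constants (0.5 * 2 = 1) and keeping the neg once. */
static float folded_expected(float r13)
{
   return -r13 * 1.0f + 1.0f;     /* equals 1 - r13 */
}
```

For any nonzero input the faulty fold gives a different result from the
original sequence, while folding only the constants preserves it.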

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94412
Signed-off-by: Xavier Bouchoux <xavierb@gmail.com>
Cc: "11.0 11.1 11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>

commit | commitdiff | tree

Samuel Pitoiset [Mon, 21 Mar 2016 12:15:44 +0000 (13:15 +0100)]

nvc0: make sure to delete samplers used by compute shaders

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>

commit | commitdiff | tree

Kenneth Graunke [Thu, 17 Mar 2016 03:19:50 +0000 (20:19 -0700)]

i965/blorp: Make BlitFramebuffer() do sRGB encoding in ES 3.x.

According to the ES 3.0 and GL 4.4 specifications, glBlitFramebuffer
is supposed to perform sRGB decoding and encoding whenever sRGB formats
are in use.  The ES 3.0 specification is completely clear, and has
always stated this.

However, the GL specification has changed behavior in 4.1, 4.2, and
4.4.  The original behavior stated that no sRGB encoding should occur.
The 4.4 behavior matches ES 3.0's wording.  However, implementing the
new behavior appears to break applications such as Left 4 Dead 2.

This patch changes BLORP to apply the ES 3.x rules in ES 3.x, but
leaves OpenGL alone for now, to avoid breaking applications.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>

commit | commitdiff | tree

Kenneth Graunke [Thu, 17 Mar 2016 03:15:52 +0000 (20:15 -0700)]

i965/blorp: Refactor sRGB encoding/decoding.

Because the rules for sRGB are so insane, we change brw_blorp_miptrees
to take decode_srgb and encode_srgb flags, which control linearization
of the source and destination separately.

This should make it easy to implement whatever crazy combination of
rules people throw at us. For now, it should be equivalent.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>

commit | commitdiff | tree

Kenneth Graunke [Tue, 8 Mar 2016 08:34:14 +0000 (00:34 -0800)]

meta: Make BlitFramebuffer() do sRGB encoding in ES 3.x.

According to the ES 3.0 and GL 4.4 specifications, glBlitFramebuffer
is supposed to perform sRGB decoding and encoding whenever sRGB formats
are in use.  The ES 3.0 specification is completely clear, and has
always stated this.

However, the GL specification has changed behavior in 4.1, 4.2, and
4.4.  The original behavior stated that no sRGB encoding should occur.
The 4.4 behavior matches ES 3.0's wording.  However, implementing the
new behavior appears to break applications such as Left 4 Dead 2.

This patch changes Meta to apply the ES 3.x rules in ES 3.x, but
leaves OpenGL alone for now, to avoid breaking applications.

Meta implements several other functions in terms of BlitFramebuffer,
and many of those explicitly do not perform sRGB encoding.  So, this
patch explicitly disables sRGB encoding in those other functions,
preserving the existing (correct) behavior.

If you're from the future and are reading this, hi!  Welcome to
the "fun" of debugging sRGB problems!  Best of luck!

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>

commit | commitdiff | tree

Nicolai Hähnle [Tue, 15 Mar 2016 18:08:21 +0000 (13:08 -0500)]

docs: mark GL_ARB_shader_image_load_store/_size as done for radeonsi

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

commit | commitdiff | tree

Edward O'Callaghan [Sun, 10 Jan 2016 13:50:32 +0000 (00:50 +1100)]

radeonsi: Set PIPE_SHADER_CAP_MAX_SHADER_IMAGES

This enables ARB_shader_image_load_store and ARB_shader_image_size.

Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
[allow the same number of images for all shader stages and require LLVM 3.9]

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

commit | commitdiff | tree

Nicolai Hähnle [Wed, 16 Mar 2016 01:58:12 +0000 (20:58 -0500)]

radeonsi: disable early Z if the fragment shader writes to memory

Empirically, both the EXEC_ON_* flags and LATE_Z are necessary.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
