git.libre-soc.org Git - mesa.git/log

commit | commitdiff | tree

Jason Ekstrand [Wed, 5 Nov 2014 01:18:48 +0000 (17:18 -0800)]

nir: Add a function for rewriting all the uses of a SSA def

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>

commit | commitdiff | tree

Jason Ekstrand [Tue, 4 Nov 2014 19:02:09 +0000 (11:02 -0800)]

nir: Automatically handle SSA uses when an instruction is inserted

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>

commit | commitdiff | tree

Jason Ekstrand [Tue, 4 Nov 2014 18:40:48 +0000 (10:40 -0800)]

nir: Add an initialization function for SSA definitions

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>

commit | commitdiff | tree

Jason Ekstrand [Wed, 29 Oct 2014 21:17:17 +0000 (14:17 -0700)]

nir: Add an SSA-based liveness analysis pass.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>

commit | commitdiff | tree

Jason Ekstrand [Fri, 31 Oct 2014 04:18:22 +0000 (21:18 -0700)]

nir: set reg_alloc and ssa_alloc when indexing registers and SSA values

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>

commit | commitdiff | tree

Jason Ekstrand [Wed, 29 Oct 2014 23:25:51 +0000 (16:25 -0700)]

nir: Add a function to detect if a block is immediately followed by an if

Since we don't actually have an "if" instruction, this is a very common
pattern when iterating over instructions. This adds a helper function for
it to make things a little less painful.
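
A minimal sketch of what such a helper could look like. The structs and names below (cf_node, block_following_if) are simplified, hypothetical stand-ins and not the real NIR data structures; the only idea taken from the message above is "look at the node after a block and report it if it is an if".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified control-flow nodes; not the real NIR structs. */
enum cf_node_type { CF_NODE_BLOCK, CF_NODE_IF, CF_NODE_LOOP };

struct cf_node {
    enum cf_node_type type;
    struct cf_node *next;   /* next sibling in the same list, or NULL */
};

/* Returns the "if" node that immediately follows `block`, or NULL when the
 * block is last in its list or is followed by something else. */
static struct cf_node *
block_following_if(const struct cf_node *block)
{
    assert(block != NULL && block->type == CF_NODE_BLOCK);

    if (block->next != NULL && block->next->type == CF_NODE_IF)
        return block->next;

    return NULL;
}
```

A pass that walks instructions can then call the helper once per block instead of open-coding the sibling check each time.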

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>

commit | commitdiff | tree

Jason Ekstrand [Wed, 29 Oct 2014 21:16:54 +0000 (14:16 -0700)]

nir: Add a foreach_block_reverse function

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>

commit | commitdiff | tree

Jason Ekstrand [Wed, 29 Oct 2014 21:16:39 +0000 (14:16 -0700)]

nir/foreach_block: Return false if the callback on the last block fails

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>

commit | commitdiff | tree

Jason Ekstrand [Wed, 29 Oct 2014 19:42:54 +0000 (12:42 -0700)]

nir: Add a basic metadata management system

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>

commit | commitdiff | tree

Jason Ekstrand [Wed, 29 Oct 2014 19:42:33 +0000 (12:42 -0700)]

nir/lower_variables_scalar: Silence a compiler warning

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>

commit | commitdiff | tree

Jason Ekstrand [Wed, 22 Oct 2014 18:24:33 +0000 (11:24 -0700)]

i965/fs_nir: Convert the shader to/from SSA

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>

commit | commitdiff | tree

Jason Ekstrand [Wed, 22 Oct 2014 19:57:28 +0000 (12:57 -0700)]

nir: Add a lower_vec_to_movs pass

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>

commit | commitdiff | tree

Jason Ekstrand [Wed, 22 Oct 2014 18:22:53 +0000 (11:22 -0700)]

nir: Add a naive from-SSA pass

This pass is kind of stupidly implemented but it should be enough to get us
up and going. We probably want something better that doesn't generate all
of the redundant moves eventually. However, the i965 backend should be
able to handle the movs, so I'm not too worried about it in the short term.

commit | commitdiff | tree

Jason Ekstrand [Tue, 21 Oct 2014 01:07:28 +0000 (18:07 -0700)]

i965/fs_nir: Don't duplicate emit_general_interpolation

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>

commit | commitdiff | tree

Jason Ekstrand [Tue, 21 Oct 2014 01:05:36 +0000 (18:05 -0700)]

i965/fs: Don't take an ir_variable for emit_general_interpolation

Previously, emit_general_interpolation took an ir_variable and pulled the
information it needed from that. This meant that in fs_fp, we were
constructing a dummy ir_variable just to pass into it. This commit makes
emit_general_interpolation take only the information it needs and gets rid
of the fs_fp cruft.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>

commit | commitdiff | tree

Jason Ekstrand [Sat, 18 Oct 2014 00:11:34 +0000 (17:11 -0700)]

nir: Add intrinsics to do alternate interpolation on inputs

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>

commit | commitdiff | tree

Jason Ekstrand [Thu, 16 Oct 2014 23:53:03 +0000 (16:53 -0700)]

nir: Add NIR_TRUE and NIR_FALSE constants and use them for boolean immediates

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>

commit | commitdiff | tree

Jason Ekstrand [Thu, 16 Oct 2014 04:52:58 +0000 (21:52 -0700)]

i965/fs_nir: Add atomic counters support

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>

commit | commitdiff | tree

Jason Ekstrand [Thu, 16 Oct 2014 16:56:14 +0000 (09:56 -0700)]

nir/lower_atomics: Multiply array offsets by ATOMIC_COUNTER_SIZE

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>

commit | commitdiff | tree

Jason Ekstrand [Wed, 15 Oct 2014 21:44:00 +0000 (14:44 -0700)]

i965/fs_nir: Handle coarse/fine derivatives

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>

commit | commitdiff | tree

Jason Ekstrand [Wed, 15 Oct 2014 23:57:10 +0000 (16:57 -0700)]

nir/glsl: Add support for coarse and fine derivatives

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>

commit | commitdiff | tree

Jason Ekstrand [Wed, 15 Oct 2014 23:56:43 +0000 (16:56 -0700)]

nir: Add fine and coarse derivative opcodes

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>

commit | commitdiff | tree

Jason Ekstrand [Wed, 15 Oct 2014 23:19:26 +0000 (16:19 -0700)]

nir/glsl: Add support for saturate

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>

commit | commitdiff | tree

Jason Ekstrand [Wed, 15 Oct 2014 23:01:04 +0000 (16:01 -0700)]

i965/fs_nir: Add support for sample_pos and sample_id

commit | commitdiff | tree

Jason Ekstrand [Wed, 15 Oct 2014 22:36:43 +0000 (15:36 -0700)]

Fix up varying pull constants

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>

commit | commitdiff | tree

Jason Ekstrand [Wed, 15 Oct 2014 20:56:48 +0000 (13:56 -0700)]

Fix what I think are a few NIR typos

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>

commit | commitdiff | tree

Jason Ekstrand [Wed, 15 Oct 2014 22:25:10 +0000 (15:25 -0700)]

i965/fs_nir: Use the correct texture offset immediate

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>

commit | commitdiff | tree

Jason Ekstrand [Wed, 15 Oct 2014 19:18:25 +0000 (12:18 -0700)]

i965/fs_nir: Use the correct types for texture inputs

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>

commit | commitdiff | tree

Jason Ekstrand [Wed, 15 Oct 2014 17:41:04 +0000 (10:41 -0700)]

i965/fs_nir: Make the sampler register always unsigned

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>

commit | commitdiff | tree

Jason Ekstrand [Tue, 14 Oct 2014 23:40:04 +0000 (16:40 -0700)]

i965/fs: Only use nir for 8-wide non-fast-clear shaders.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>

commit | commitdiff | tree

Connor Abbott [Fri, 15 Aug 2014 17:32:07 +0000 (10:32 -0700)]

i965/fs: add a NIR frontend

This is similar to the GLSL IR frontend, except consuming NIR. This lets
us test NIR as part of an actual compiler.

v2: Jason Ekstrand <jason.ekstrand@intel.com>:
   Make brw_fs_nir build again
   Only use NIR if INTEL_USE_NIR is set
   whitespace fixes

commit | commitdiff | tree

Connor Abbott [Fri, 15 Aug 2014 17:17:26 +0000 (10:17 -0700)]

i965/fs: Don't pass through the coordinate type

All we really need is the number of components.

commit | commitdiff | tree

Connor Abbott [Tue, 5 Aug 2014 18:02:02 +0000 (11:02 -0700)]

i965/fs: make emit_fragcoord_interpolation() not take an ir_variable

commit | commitdiff | tree

Connor Abbott [Thu, 24 Jul 2014 22:51:58 +0000 (15:51 -0700)]

nir: add an SSA-based dead code elimination pass

v2: Jason Ekstrand <jason.ekstrand@intel.com>:
whitespace fixes

commit | commitdiff | tree

Connor Abbott [Wed, 23 Jul 2014 18:19:50 +0000 (11:19 -0700)]

nir: add an SSA-based copy propagation pass

commit | commitdiff | tree

Connor Abbott [Tue, 22 Jul 2014 21:05:06 +0000 (14:05 -0700)]

nir: add a pass to convert to SSA

v2: Jason Ekstrand <jason.ekstrand@intel.com>:
whitespace fixes

commit | commitdiff | tree

Connor Abbott [Fri, 18 Jul 2014 23:13:11 +0000 (16:13 -0700)]

nir: calculate dominance information

commit | commitdiff | tree

Connor Abbott [Wed, 30 Jul 2014 19:08:13 +0000 (12:08 -0700)]

nir: add an optimization to turn global registers into local registers

After linking and inlining, this allows us to convert these registers
into SSA values and optimise more code.

commit | commitdiff | tree

Connor Abbott [Wed, 30 Jul 2014 21:43:26 +0000 (14:43 -0700)]

nir: add a pass to lower atomics

v2: Jason Ekstrand <jason.ekstrand@intel.com>:
whitespace fixes

commit | commitdiff | tree

Connor Abbott [Wed, 30 Jul 2014 19:07:45 +0000 (12:07 -0700)]

nir: add a pass to lower system value reads

v2: Jason Ekstrand <jason.ekstrand@intel.com>:
whitespace fixes

commit | commitdiff | tree

Connor Abbott [Wed, 30 Jul 2014 19:04:49 +0000 (12:04 -0700)]

nir: add a pass to lower sampler instructions

commit | commitdiff | tree

Connor Abbott [Wed, 30 Jul 2014 18:56:52 +0000 (11:56 -0700)]

nir: add a pass to remove unused variables

After we lower variables, we want to delete them in order to free up
some memory.

v2: Jason Ekstrand <jason.ekstrand@intel.com>:
whitespace fixes

commit | commitdiff | tree

Connor Abbott [Tue, 5 Aug 2014 17:54:27 +0000 (10:54 -0700)]

nir: keep track of the number of input, output, and uniform slots

commit | commitdiff | tree

Connor Abbott [Thu, 17 Jul 2014 16:12:52 +0000 (09:12 -0700)]

nir: add a pass to lower variables for scalar backends

commit | commitdiff | tree

Connor Abbott [Fri, 11 Jul 2014 01:18:17 +0000 (18:18 -0700)]

nir: add a glsl-to-nir pass

v2: Jason Ekstrand <jason.ekstrand@intel.com>:
Make glsl_to_nir build again
fix whitespace

commit | commitdiff | tree

Connor Abbott [Wed, 30 Jul 2014 22:20:53 +0000 (15:20 -0700)]

nir: add a validation pass

This is similar to ir_validate.cpp.

v2: Jason Ekstrand <jason.ekstrand@intel.com>:
whitespace fixes

commit | commitdiff | tree

Connor Abbott [Wed, 30 Jul 2014 22:29:27 +0000 (15:29 -0700)]

nir: add a printer

This is similar to ir_print_visitor.cpp.

v2: Jason Ekstrand <jason.ekstrand@intel.com>:
whitespace fixes

commit | commitdiff | tree

Jason Ekstrand [Thu, 18 Dec 2014 01:30:27 +0000 (17:30 -0800)]

SQUASH: Fix comments from eric

Reviewed-by: Eric Anholt <eric@anholt.net>

commit | commitdiff | tree

Jason Ekstrand [Wed, 29 Oct 2014 21:15:13 +0000 (14:15 -0700)]

SQUASH: Add an assert

commit | commitdiff | tree

Connor Abbott [Thu, 31 Jul 2014 23:16:23 +0000 (16:16 -0700)]

nir: add core helper functions

These include functions for adding and removing various bits of IR and
helpers for iterating over all the sources and destinations of an
instruction. This is similar to ir.cpp.

v2: Jason Ekstrand <jason.ekstrand@intel.com>:
whitespace and automake fixes

commit | commitdiff | tree

Jason Ekstrand [Wed, 26 Nov 2014 23:08:19 +0000 (15:08 -0800)]

SQUASH: Use the enum for the variable mode

commit | commitdiff | tree

Connor Abbott [Thu, 31 Jul 2014 23:14:51 +0000 (16:14 -0700)]

nir: add the core datastructures

This includes all the instructions, ifs, loops, functions, etc. This is
similar to the information in ir.h.

v2: Jason Ekstrand <jason.ekstrand@intel.com>:
Include ralloc and hash_table from the util directory
whitespace fixes

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: glenn.kennard <glenn.kennard@gmail.com>

commit | commitdiff | tree

Connor Abbott [Wed, 30 Jul 2014 22:33:32 +0000 (15:33 -0700)]

nir: add a simple C wrapper around glsl_types.h

v2: Jason Ekstrand <jason.ekstrand@intel.com>:
whitespace and automake fixes

Reviewed-by: Eric Anholt <eric@anholt.net>

commit | commitdiff | tree

Connor Abbott [Wed, 30 Jul 2014 22:32:21 +0000 (15:32 -0700)]

nir: add initial README

Reviewed-by: Eric Anholt <eric@anholt.net>

commit | commitdiff | tree

Connor Abbott [Tue, 22 Jul 2014 00:11:53 +0000 (17:11 -0700)]

exec_list: add a list_foreach_typed_reverse() macro

Reviewed-by: Eric Anholt <eric@anholt.net>

commit | commitdiff | tree

Eric Anholt [Tue, 13 Jan 2015 22:23:43 +0000 (11:23 +1300)]

vc4: Add some dumping for STORE_TILE_BUFFER_GENERAL.

commit | commitdiff | tree

Eric Anholt [Tue, 13 Jan 2015 21:53:20 +0000 (10:53 +1300)]

vc4: Add dumping for the TILE_RENDERING_MODE_CONFIG packet.

I wanted to read it, so I wrote parsing.

commit | commitdiff | tree

Eric Anholt [Tue, 13 Jan 2015 21:06:02 +0000 (10:06 +1300)]

vc4: Fix CL dumping trying to dump too far.

Execution will end at the cl->next, because that's what ct0ea/ct1ea get
programmed to.

commit | commitdiff | tree

Eric Anholt [Tue, 13 Jan 2015 03:43:16 +0000 (16:43 +1300)]

vc4: Fix texture type masking.

Everything from ETC1 to RGBA64 was getting its top bit dropped, but we
didn't use any of those formats.

commit | commitdiff | tree

Eric Anholt [Mon, 12 Jan 2015 01:53:48 +0000 (14:53 +1300)]

vc4: Colormask should apply after all other fragment ops (like logic op).

Theoretically it should apply after dithering as well, but dithering for
565 happens in fixed function in the TLB store.

commit | commitdiff | tree

Eric Anholt [Sun, 11 Jan 2015 20:14:41 +0000 (09:14 +1300)]

vc4: No turning unpack arguments into small immediates.

Since unpack only happens on things read from the A register file, we have
to leave them as something that can be allocated to A (temp or uniform).

commit | commitdiff | tree

Eric Anholt [Sun, 11 Jan 2015 20:10:35 +0000 (09:10 +1300)]

vc4: Move the tests for src needing to be an A register to vc4_qir.c.

I want it from another location.

commit | commitdiff | tree

Eric Anholt [Sun, 11 Jan 2015 20:16:26 +0000 (09:16 +1300)]

vc4: Don't swap the raddr on instructions doing unpacks.

It would mean different unpacking behavior, since only the A file does
unpack (with PM==0).

commit | commitdiff | tree

Eric Anholt [Sun, 11 Jan 2015 06:31:59 +0000 (19:31 +1300)]

vc4: Don't let pairing happen with badly mismatched unpack flags.

No difference on shader-db, but prevents definite regressions in the
blending changes.

commit | commitdiff | tree

Eric Anholt [Sun, 11 Jan 2015 05:27:07 +0000 (18:27 +1300)]

vc4: Don't let pairing happen with badly mismatched pack flags.

No difference on shader-db, but will become more important as I introduce
more use of pack flags with the blending changes.

commit | commitdiff | tree

Eric Anholt [Wed, 14 Jan 2015 04:11:59 +0000 (17:11 +1300)]

vc4: Fix early Z behavior on hardware.

It turns out the simulator was not treating this bit the same as the RPi,
and I'd forgotten to remove it when turning on early Z. The result was
that you'd get big chunks of your rendering missing.

commit | commitdiff | tree

Michel Dänzer [Tue, 13 Jan 2015 07:38:52 +0000 (16:38 +0900)]

Revert "radeonsi: only set BC_OPTIMIZE_DISABLE when necessary"

This reverts commit 0543630d0b0d9d9f6eefbc14fbd3385d4de37ba0.

It caused flickering artifacts in Steam games such as Team Fortress 2 or
Left 4 Dead 2.

We could probably only enable this optimization by also making sure the
shader code only uses either SI_PARAM_LINEAR_CENTROID or
SI_PARAM_LINEAR_CENTER, not both. This would probably require a shader
variant.

Sorry I didn't remember this when reviewing the reverted change.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

commit | commitdiff | tree

Michel Dänzer [Thu, 15 Jan 2015 03:57:05 +0000 (12:57 +0900)]

st/clover: Adapt to TargetLibraryInfo.h move in LLVM SVN r226078

Trivial.

commit | commitdiff | tree

Ian Romanick [Fri, 7 Nov 2014 06:51:45 +0000 (22:51 -0800)]

mesa: Micro-optimize _mesa_is_valid_prim_mode

You would not believe the mess GCC 4.8.3 generated for the old
switch-statement.

On Bay Trail-D using Fedora 20 compile flags (-m64 -O2 -mtune=generic
for 64-bit and -m32 -march=i686 -mtune=atom for 32-bit), affects
Gl32Batch7:

32-bit: Difference at 95.0% confidence -0.37374% +/- 0.184057% (n=40)
64-bit: Difference at 95.0% confidence 0.966722% +/- 0.338442% (n=40)

The regression on 32-bit is odd. Callgrind says the caller,
_mesa_is_valid_prim_mode is faster. Before it says 2,293,760
cycles, and after it says 917,504.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

commit | commitdiff | tree

Ian Romanick [Tue, 11 Nov 2014 10:29:34 +0000 (10:29 +0000)]

mesa: Check for vertex program the same way in desktop GL and ES

On Bay Trail-D using Fedora 20 compile flags (-m64 -O2 -mtune=generic
for 64-bit and -m32 -march=i686 -mtune=atom for 32-bit), affects
Gl32Multithread:

32-bit: Difference at 95.0% confidence 0.416027% +/- 0.163529% (n=40)
64-bit: Difference at 95.0% confidence 0.494771% +/- 0.259985% (n=40)

Gl32Batch7 had no difference proven at 95.0% confidence (n=120) on
32-bit or 64-bit.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

commit | commitdiff | tree

Ian Romanick [Tue, 11 Nov 2014 09:21:40 +0000 (09:21 +0000)]

mesa: Drop index buffer bounds check

The previous check was insufficient (as it did not take 'indices' into
consideration), and DX10 hardware does not need this check anyway.

Since index_bytes is no longer used, remove it.

On Bay Trail-D using Fedora 20 compile flags (-m64 -O2 -mtune=generic
for 64-bit and -m32 -march=i686 -mtune=atom for 32-bit), affects
Gl32Batch7:

32-bit: Difference at 95.0% confidence 1.66929% +/- 0.230107% (n=40)
64-bit: Difference at 95.0% confidence -1.40848% +/- 0.288038% (n=40)

The regression on 64-bit is odd. Callgrind says the caller,
validate_DrawElements_common is faster. Before it says 10,321,920
cycles, and after it says 8,945,664.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

commit | commitdiff | tree

Ian Romanick [Tue, 11 Nov 2014 11:28:28 +0000 (11:28 +0000)]

mesa: Only check for a current vertex shader in core profile

This doesn't affect performance, but it feels more correct.

On Bay Trail-D using Fedora 20 compile flags (-m64 -O2 -mtune=generic
for 64-bit and -m32 -march=i686 -mtune=atom for 32-bit), affects
Gl32Batch7:

32-bit: No difference proven at 95.0% confidence (n=120)
64-bit: No difference proven at 95.0% confidence (n=120)

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

commit | commitdiff | tree

Ian Romanick [Tue, 11 Nov 2014 12:31:22 +0000 (12:31 +0000)]

mesa: Only validate shaders that can exist in the context

On Bay Trail-D using Fedora 20 compile flags (-m64 -O2 -mtune=generic
for 64-bit and -m32 -march=i686 -mtune=atom for 32-bit), affects
Gl32Batch7:

32-bit: Difference at 95.0% confidence 0.495267% +/- 0.202063% (n=40)
64-bit: Difference at 95.0% confidence 3.57576% +/- 0.288175% (n=40)

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

commit | commitdiff | tree

Ian Romanick [Tue, 11 Nov 2014 14:51:29 +0000 (14:51 +0000)]

i965: Store the atoms directly in the context

Instead of having an extra pointer indirection in one of the hottest
loops in the driver.
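
The difference between the two layouts can be sketched as follows. Everything here (struct atom, MAX_ATOMS, the ctx_* types and the counting loops) is a hypothetical illustration of "array of pointers" versus "array of structs in the context", not the driver's actual types.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ATOMS 4   /* hypothetical capacity, not the driver's real count */

/* Hypothetical state atom: the dirty bits it reacts to. */
struct atom {
    unsigned dirty_bits;
};

/* Before: the context points at a table of pointers, so every loop
 * iteration loads a pointer and then dereferences it. */
struct ctx_indirect {
    const struct atom *const *atoms;
    size_t num_atoms;
};

/* After: the atoms are copied into the context, so the loop walks one
 * contiguous array. */
struct ctx_direct {
    struct atom atoms[MAX_ATOMS];
    size_t num_atoms;
};

/* Copy the atoms out of the pointer table once, at setup time. */
static void
ctx_direct_init(struct ctx_direct *ctx,
                const struct atom *const *table, size_t n)
{
    assert(n <= MAX_ATOMS);
    for (size_t i = 0; i < n; i++)
        ctx->atoms[i] = *table[i];
    ctx->num_atoms = n;
}

/* Hot loop, indirect form: two dependent loads per atom. */
static size_t
count_dirty_indirect(const struct ctx_indirect *ctx, unsigned dirty)
{
    size_t hits = 0;
    for (size_t i = 0; i < ctx->num_atoms; i++)
        if (ctx->atoms[i]->dirty_bits & dirty)
            hits++;
    return hits;
}

/* Hot loop, direct form: one load per atom. */
static size_t
count_dirty_direct(const struct ctx_direct *ctx, unsigned dirty)
{
    size_t hits = 0;
    for (size_t i = 0; i < ctx->num_atoms; i++)
        if (ctx->atoms[i].dirty_bits & dirty)
            hits++;
    return hits;
}
```

The cost of the direct form is the one-time copy and a larger context structure.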

On Bay Trail-D using Fedora 20 compile flags (-m64 -O2 -mtune=generic
for 64-bit and -m32 -march=i686 -mtune=atom for 32-bit), affects
Gl32Batch7:

32-bit: Difference at 95.0% confidence 1.98515% +/- 0.20814% (n=40)
64-bit: Difference at 95.0% confidence 1.5163% +/- 0.811016% (n=60)

v2 (Ken): Cut size of array from 64 to 57 to save memory.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

commit | commitdiff | tree

Ian Romanick [Mon, 10 Nov 2014 14:06:47 +0000 (06:06 -0800)]

i965: Micro-optimize brw_get_index_type

With the switch-statement, GCC 4.8.3 produces a small pile of code with
a branch.

00000000 <brw_get_index_type>:
  000000:       8b 54 24 04             mov    0x4(%esp),%edx
  000004:       b8 01 00 00 00          mov    $0x1,%eax
  000009:       81 fa 03 14 00 00       cmp    $0x1403,%edx
  00000f:       74 0d                   je     00001e <brw_get_index_type+0x1e>
  000011:       31 c0                   xor    %eax,%eax
  000013:       81 fa 05 14 00 00       cmp    $0x1405,%edx
  000019:       0f 94 c0                sete   %al
  00001c:       01 c0                   add    %eax,%eax
  00001e:       c3                      ret

However, this could be two instructions.

00000000 <brw_get_index_type>:
  000000:       2d 01 14 00 00          sub    $0x1401,%eax
  000005:       d1 e8                   shr    %eax
  000007:       90                      nop
  000008:       90                      nop
  000009:       90                      nop
  00000a:       90                      nop
  00000b:       c3                      ret

The function was also moved to the header so that it could be inlined at
the two call sites.  Without this, 32-bit also needs to pull the
parameter from the stack.  This means there is a push, a call, a move,
and a ret added to a two instruction function.  The above code shows the
function with __attribute__((regparm(1))), but even this adds several
extra instructions.  There is also an extra instruction on 64-bit to
move the parameter to %eax for the subtract.
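
Read back into C, the two listings above correspond roughly to the sketch below. It is reconstructed from the disassembly only: the function names are made up, and the return values 0/1/2 are simply what the listings produce, not necessarily the constants used in the driver source. The three GL enum values are the standard ones from the GL headers.

```c
#include <assert.h>

/* OpenGL index type enums (standard values from the GL headers). */
#define GL_UNSIGNED_BYTE  0x1401
#define GL_UNSIGNED_SHORT 0x1403
#define GL_UNSIGNED_INT   0x1405

/* Switch form: matches the first listing (compare, branch, sete). */
static unsigned
index_type_switch(unsigned type)
{
    switch (type) {
    case GL_UNSIGNED_SHORT: return 1;
    case GL_UNSIGNED_INT:   return 2;
    default:                return 0;   /* GL_UNSIGNED_BYTE lands here */
    }
}

/* Arithmetic form: matches the second listing (sub $0x1401, shr).
 * The three valid enums are 2 apart, so subtracting the first one and
 * halving yields 0, 1, 2 with no branch. */
static inline unsigned
index_type_arith(unsigned type)
{
    assert(type == GL_UNSIGNED_BYTE ||
           type == GL_UNSIGNED_SHORT ||
           type == GL_UNSIGNED_INT);
    return (type - GL_UNSIGNED_BYTE) >> 1;
}
```

The arithmetic form relies on the caller passing only one of the three valid enums; the switch form tolerates any input.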

On Bay Trail-D using Fedora 20 compile flags (-m64 -O2 -mtune=generic
for 64-bit and -m32 -march=i686 -mtune=atom for 32-bit), affects
Gl32Batch7:

32-bit: Difference at 95.0% confidence 0.818589% +/- 0.234661% (n=40)
64-bit: Difference at 95.0% confidence 0.54554% +/- 0.354092% (n=40)

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>

commit | commitdiff | tree

Ian Romanick [Tue, 11 Nov 2014 14:14:14 +0000 (14:14 +0000)]

meta: Put _mesa_meta_in_progress in the header file

...so that it can be inlined in the two places that call it.

On Bay Trail-D using Fedora 20 compile flags (-m64 -O2 -mtune=generic
for 64-bit and -m32 -march=i686 -mtune=atom for 32-bit), affects
Gl32Batch7:

32-bit: No difference proven at 95.0% confidence (n=120)
64-bit: Difference at 95.0% confidence 1.24042% +/- 0.382277% (n=40)

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>

commit | commitdiff | tree

Kenneth Graunke [Tue, 13 Jan 2015 22:56:54 +0000 (14:56 -0800)]

i965: Fix "vertex" vs. "geometry" and "VS" vs. "GS" in debug output.

We were happily printing "Native code for unnamed vertex shader" and
"VS vec4" program for geometry shaders in our INTEL_DEBUG=gs output,
as well as the KHR_debug output used by shader-db.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>

commit | commitdiff | tree

Kenneth Graunke [Tue, 13 Jan 2015 22:28:13 +0000 (14:28 -0800)]

i965: Pass a shader stage abbreviation to fs_generator().

A lot of messages hardcoded the string "FS", which is confusing on
Broadwell, where we use this code for VS support as well.

shader-db particularly got confused, as it reported two "FS SIMD8"
shaders, and no vertex shaders at all. Craziness ensued.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>

commit | commitdiff | tree

Samuel Iglesias Gonsalvez [Tue, 13 Jan 2015 10:02:27 +0000 (11:02 +0100)]

configure: add check for GNU indent

Only GNU indent is supported when indenting the autogenerated format_pack.c
and format_unpack.c files. Some non-GNU indent implementations (Mac OS X and
FreeBSD) add extra whitespace that breaks the build of those files.

Fall back to 'cat' if a non-GNU indent is found.

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=88335
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>

commit | commitdiff | tree

Samuel Iglesias Gonsalvez [Wed, 14 Jan 2015 06:52:13 +0000 (07:52 +0100)]

configure: change required Python Mako version to 0.3.4

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>

commit | commitdiff | tree

Iago Toral Quiroga [Tue, 13 Jan 2015 07:33:19 +0000 (08:33 +0100)]

mesa: rename RGBA8888_* format constants to something appropriate.

The 8888 suggests 8-bit components which is not correct, so
replace that with the actual size of the components in each
format.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>

commit | commitdiff | tree

Jason Ekstrand [Tue, 13 Jan 2015 01:10:22 +0000 (17:10 -0800)]

i965/miptree_map_blit: Don't do the initial copy if INVALIDATE_RANGE is set

Before we were always copying from the buffer being mapped into the
temporary buffer. However, if INVALIDATE_RANGE is set, then we know that
the data is going to be junk after we unmap so there's no point in doing
the blit. This is important because doing the blit will cause a stall 3
lines later when we map the buffer.
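
The decision described above can be sketched as a single flag test. The helper name is hypothetical and the surrounding map logic is omitted; the flag values are the standard glMapBufferRange access bits, on the assumption that INVALIDATE_RANGE refers to GL_MAP_INVALIDATE_RANGE_BIT.

```c
#include <assert.h>
#include <stdbool.h>

/* Access bits as defined for glMapBufferRange. */
#define GL_MAP_READ_BIT             0x0001
#define GL_MAP_WRITE_BIT            0x0002
#define GL_MAP_INVALIDATE_RANGE_BIT 0x0004

/* Hypothetical helper: should the map path blit the current contents
 * into the temporary buffer before handing it to the caller? */
static bool
map_needs_initial_copy(unsigned mode)
{
    /* A mapping must request at least one kind of access. */
    assert(mode & (GL_MAP_READ_BIT | GL_MAP_WRITE_BIT));

    /* If the range is being invalidated, its old contents are never
     * needed, so the blit (and the stall it causes) can be skipped. */
    return (mode & GL_MAP_INVALIDATE_RANGE_BIT) == 0;
}
```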

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

commit | commitdiff | tree

Tapani Pälli [Tue, 25 Nov 2014 11:10:30 +0000 (06:10 -0500)]

mesa/glsl/glapi: enable GL_EXT_draw_buffers extension

Patch enables ES2 extension that utilizes existing ES3 functionality.

Changes make all the subtests to run and pass in WebGL conformance
test 'webgl-draw-buffers' when running Chrome on OpenGL ES, also
Piglit test 'draw_buffers_gles2' passes.

v2: remove unused boolean (Ilia Mirkin)
v3: proper error checking for invalid values (Chad Versace)
v4: run error check explicitly for ES2 and ES3 (Kenneth Graunke)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>

commit | commitdiff | tree

Jason Ekstrand [Thu, 16 Oct 2014 18:45:44 +0000 (11:45 -0700)]

i965/fs: Allow constant propagation between different types

This will be needed for NIR because it is typeless and treats all constants
as uint32 values and reinterprets them when they are used later. This
commit allows those values to be properly propagated.

Also, this helps some synmark shaders because it allows us to copy
propagate a 0x00000000UD into a 0.0F in a load_payload, which then lets us
combine 4 load_payloads.
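
Why a 0x00000000UD immediate can stand in for 0.0F: the propagation is only a reinterpretation of the same 32 bits. A small sketch, assuming IEEE 754 single-precision floats; the helper names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit unsigned immediate as a float without changing
 * any bits (assumes float is 32-bit IEEE 754). */
static float
uint_bits_as_float(uint32_t bits)
{
    float f;
    assert(sizeof f == sizeof bits);
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* The reverse direction: the raw bit pattern of a float. */
static uint32_t
float_as_uint_bits(float f)
{
    uint32_t bits;
    assert(sizeof f == sizeof bits);
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

With these, uint_bits_as_float(0x00000000) compares equal to 0.0f, which is the case the message mentions; other constants need the bit pattern carried over unchanged, not a numeric conversion.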

instructions in affected programs: 2288 -> 2144 (-6.29%)

Reviewed-by: Matt Turner <mattst88@gmail.com>

commit | commitdiff | tree

Chad Versace [Tue, 13 Jan 2015 19:30:55 +0000 (11:30 -0800)]

egl/wayland: Fix unused variable warnings

Remove ctx variables unused as of 70e8ccc459.

commit | commitdiff | tree

Mike Mason [Mon, 12 Jan 2015 22:37:28 +0000 (14:37 -0800)]

mesa: Enable GL_RGB/GL_RGBA in GLES3 glGetInternalformativ

Removes commit 7894278 changes and moves fix to _mesa_GetInternalformativ().
The original commit enabled the GL_RGB and GL_RGBA unsized internal formats
as valid for render buffers in GLES3, but this is incorrect. They should
have only been enabled for GetInternalformativ().

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88079
Reviewed-by: Chad Versace <chad.versace@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>

commit | commitdiff | tree

Rob Clark [Tue, 13 Jan 2015 04:32:25 +0000 (23:32 -0500)]

freedreno/ir3: handle "holes" in inputs

If, for example, only the x/y/w components of in.xyzw are actually used,
we still need to have a group of four registers and assign all four
components. The hardware can't write in.xy and in.w to discontiguous
registers. To handle this, pad with a dummy NOP instruction, to keep
the neighbor chain contiguous.

This fixes a problem noticed with firefox OMTC.

Signed-off-by: Rob Clark <robclark@freedesktop.org>

commit | commitdiff | tree

Iago Toral Quiroga [Mon, 15 Dec 2014 08:29:55 +0000 (09:29 +0100)]

mesa: Fix error reporting for some cases of incomplete FBO attachments

According to the OpenGL and OpenGL ES specs (sections
"FRAMEBUFFER COMPLETENESS" and "Whole Framebuffer Completeness"),
the image for color, depth or stencil attachments must be renderable,
otherwise the attachment is considered incomplete and we should report
GL_FRAMEBUFFER_INCOMPLETE_ATTACHMENT. Currently, we detect this
situation properly but report a different error.

This fixes the following 3 dEQP tests:
dEQP-GLES3.functional.fbo.completeness.renderable.texture.color0.rgb_unsigned_int_2_10_10_10_rev
dEQP-GLES3.functional.fbo.completeness.renderable.texture.color0.rgba_unsigned_int_2_10_10_10_rev
dEQP-GLES3.functional.fbo.completeness.renderable.texture.color0.rgb16f

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

commit | commitdiff | tree

Eduardo Lima Mitev [Thu, 11 Dec 2014 22:34:20 +0000 (23:34 +0100)]

mesa: Returns a GL_INVALID_VALUE error if num of texs in glDeleteTextures is negative

Per GLES3 manual for glDeleteTextures
<https://www.khronos.org/opengles/sdk/docs/man3/html/glDeleteTextures.xhtml>,
GL_INVALID_VALUE is generated if n is negative.
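
The check itself is small. A sketch with a hypothetical validation helper (not Mesa's actual entry point), using the standard GL error enum values:

```c
#include <assert.h>

/* Standard GL error enums. */
#define GL_NO_ERROR      0
#define GL_INVALID_VALUE 0x0501

/* Hypothetical stand-in for the argument check at the top of a
 * glDelete*(n, names) entry point: a negative count is an error,
 * zero is valid and deletes nothing. */
static unsigned
validate_delete_count(int n)
{
    return n < 0 ? GL_INVALID_VALUE : GL_NO_ERROR;
}

/* Usage example: the cases a negative-API test would exercise. */
static void
validate_delete_count_examples(void)
{
    assert(validate_delete_count(-1) == GL_INVALID_VALUE);
    assert(validate_delete_count(0) == GL_NO_ERROR);
    assert(validate_delete_count(8) == GL_NO_ERROR);
}
```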

Fixes 1 dEQP test:
* dEQP-GLES3.functional.negative_api.texture.deletetextures

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

commit | commitdiff | tree

Eduardo Lima Mitev [Thu, 11 Dec 2014 22:34:18 +0000 (23:34 +0100)]

mesa: Returns a GL_INVALID_VALUE error if num of fbos in glDeleteRenderbuffers is negative

Per GLES3 manual for glDeleteRenderbuffers
<https://www.khronos.org/opengles/sdk/docs/man3/html/glDeleteRenderbuffers.xhtml>,
GL_INVALID_VALUE is generated if n is negative.

Fixes 1 dEQP test:
* dEQP-GLES3.functional.negative_api.buffer.delete_renderbuffers

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

commit | commitdiff | tree

Eduardo Lima Mitev [Thu, 11 Dec 2014 22:34:17 +0000 (23:34 +0100)]

mesa: Returns a GL_INVALID_VALUE error if num of fbos in glDeleteFramebuffers is negative

Per GLES3 manual for glDeleteFramebuffers
<https://www.khronos.org/opengles/sdk/docs/man3/html/glDeleteFramebuffers.xhtml>,
GL_INVALID_VALUE is generated if n is negative.

Fixes 1 dEQP test:
* dEQP-GLES3.functional.negative_api.buffer.delete_framebuffers

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

commit | commitdiff | tree

Eduardo Lima Mitev [Thu, 11 Dec 2014 22:34:16 +0000 (23:34 +0100)]

mesa: Allows querying GL_SAMPLER_BINDING on GLES3 profile

From GLES3 specification (page 123), "The currently bound sampler may be
queried by calling GetIntegerv with pname set to
SAMPLER_BINDING".

Fixes 4 dEQP tests:
* dEQP-GLES3.functional.state_query.integers.sampler_binding_getboolean
* dEQP-GLES3.functional.state_query.integers.sampler_binding_getinteger
* dEQP-GLES3.functional.state_query.integers.sampler_binding_getinteger64
* dEQP-GLES3.functional.state_query.integers.sampler_binding_getfloat

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

commit | commitdiff | tree

Samuel Iglesias Gonsalvez [Thu, 11 Dec 2014 22:34:15 +0000 (23:34 +0100)]

main: round floating-point value to nearest integer in glGetSamplerParameteriv()

Previously, a cast was done to convert from float to int but there
were rounding errors.

The spec specifies in the Data Conversion chapter that floating-point values are
rounded to the nearest integer.
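
The difference between the two conversions, as a sketch. The helper names are made up, and the rounding shown (add or subtract 0.5, then truncate) is one common way to round to nearest without libm; it is not claimed to be the exact code the patch uses.

```c
#include <assert.h>

/* Plain cast: truncates toward zero, so 0.75 becomes 0. */
static int
float_to_int_cast(float f)
{
    return (int)f;
}

/* Round to nearest, with halves rounded away from zero.
 * Sketch only: assumes the value is well inside the range of int. */
static int
float_to_int_round(float f)
{
    assert(f >= -1.0e9f && f <= 1.0e9f);
    return (int)(f >= 0.0f ? f + 0.5f : f - 0.5f);
}
```

For whole-number values the two agree; they differ for anything with a fractional part of one half or more, which is where a state query through the integer getter would report the wrong value.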

This patch fixes the following 2 dEQP tests:

dEQP-GLES3.functional.state_query.sampler.sampler_texture_min_lod_getsamplerparameteri
dEQP-GLES3.functional.state_query.sampler.sampler_texture_max_lod_getsamplerparameteri

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

commit | commitdiff | tree

Samuel Iglesias Gonsalvez [Thu, 11 Dec 2014 22:34:14 +0000 (23:34 +0100)]

main: round floating-point value to nearest integer in glGetTexParameteriv()

Previously, a cast was done to convert from float to int but there
were rounding errors.

The spec specifies in the Data Conversion chapter that floating-point values are
rounded to the nearest integer.

This patch fixes the following 8 dEQP tests:

dEQP-GLES3.functional.state_query.texture.texture_2d_texture_min_lod_gettexparameteri
dEQP-GLES3.functional.state_query.texture.texture_2d_texture_max_lod_gettexparameteri
dEQP-GLES3.functional.state_query.texture.texture_3d_texture_min_lod_gettexparameteri
dEQP-GLES3.functional.state_query.texture.texture_3d_texture_max_lod_gettexparameteri
dEQP-GLES3.functional.state_query.texture.texture_2d_array_texture_min_lod_gettexparameteri
dEQP-GLES3.functional.state_query.texture.texture_2d_array_texture_max_lod_gettexparameteri
dEQP-GLES3.functional.state_query.texture.texture_cube_map_texture_min_lod_gettexparameteri
dEQP-GLES3.functional.state_query.texture.texture_cube_map_texture_max_lod_gettexparameteri

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

commit | commitdiff | tree

Samuel Iglesias Gonsalvez [Thu, 11 Dec 2014 22:34:13 +0000 (23:34 +0100)]

main: fix return GL_FRAMEBUFFER_ATTACHMENT_TEXTURE_LEVEL value

Return the proper value for two-dimensional array textures and
three-dimensional textures.

From OpenGL ES 3.0 spec, chapter 6.1.13 "Framebuffer Object Queries",
page 234:

"If pname is FRAMEBUFFER_ATTACHMENT_TEXTURE_LAYER and the texture
object named FRAMEBUFFER_ATTACHMENT_OBJECT_NAME is a layer of a
three-dimensional texture or a two-dimensional array texture, then params
will contain the number of the texture layer which contains the attached
image. Otherwise params will contain the value zero."

Furthermore, FRAMEBUFFER_ATTACHMENT_TEXTURE_LAYER is an alias of
FRAMEBUFFER_ATTACHMENT_TEXTURE_3D_ZOFFSET_EXT.
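
The rule quoted above can be sketched as a small selection function. This is
an illustration only: the enum, the struct and the function name are
hypothetical and much simpler than Mesa's real attachment bookkeeping.

```c
/* Hypothetical, simplified texture targets; not Mesa's actual types. */
enum tex_target { TEX_2D, TEX_3D, TEX_2D_ARRAY, TEX_CUBE_MAP };

/* Hypothetical attachment record. */
struct attachment {
    enum tex_target target;
    int zoffset; /* layer or z slice holding the attached image */
};

/* Value to report for FRAMEBUFFER_ATTACHMENT_TEXTURE_LAYER: the layer
 * number for 3D and 2D array textures, zero for everything else. */
static int attachment_texture_layer(const struct attachment *att)
{
    if (att->target == TEX_3D || att->target == TEX_2D_ARRAY)
        return att->zoffset;
    return 0;
}
```

A 2D array attachment with zoffset 3 reports 3, while a plain 2D attachment
reports 0 regardless of what zoffset holds.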

This patch fixes the following dEQP test:

dEQP-GLES3.functional.state_query.fbo.framebuffer_attachment_texture_layer

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

commit | commitdiff | tree

Iago Toral Quiroga [Wed, 17 Dec 2014 13:19:01 +0000 (14:19 +0100)]

i965: Fix bitcast operations with negate (ceil)

Commit 0ae9ca12a8 put source modifiers out of the bitcast operations
by adding a MOV operation that would handle them separately. It missed
the case of ceil though: the implementation negates both its source and
destination operands. The source operand will be used for RNDD, which
we can handle normally, but we need to fix the modifier for the
negated result.

v2:
- RNDD can handle the source modifier so no need to put that one
in a separate MOV.
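
The identity behind this implementation is ceil(x) = -floor(-x): one negate
on the source, a round-down, and one negate on the result. The sketch below
shows it on the CPU. The function names are hypothetical, and round_down() is
only a stand-in for the hardware RNDD instruction that assumes x fits in a
long.

```c
/* Stand-in for RNDD (round toward negative infinity).
 * Assumption: x is finite and fits in a long. */
static float round_down(float x)
{
    float t = (float)(long)x; /* truncates toward zero */
    return (t > x) ? t - 1.0f : t;
}

/* ceil(x) expressed as a negated round-down of the negated source. */
static float ceil_via_round_down(float x)
{
    return -round_down(-x);
}
```

For 1.2f the inner round-down of -1.2f gives -2.0f, so the result is 2.0f.
For -1.2f the round-down of 1.2f gives 1.0f, so the result is -1.0f.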

Fixes the following 42 dEQP tests:
dEQP-GLES3.functional.shaders.builtin_functions.common.ceil.*_vertex
dEQP-GLES3.functional.shaders.builtin_functions.common.ceil.*_fragment
dEQP-GLES3.functional.shaders.builtin_functions.precision.ceil._*vertex.*
dEQP-GLES3.functional.shaders.builtin_functions.precision.ceil._*fragment.*

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>

commit | commitdiff | tree

Iago Toral Quiroga [Fri, 12 Dec 2014 14:14:32 +0000 (15:14 +0100)]

mesa: Depth and stencil attachments must be the same in OpenGL ES3

"9.4. FRAMEBUFFER COMPLETENESS
...
Depth and stencil attachments, if present, are the same image."

Notice that this restriction is not included in the OpenGL ES2 spec.
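
A completeness check for this rule might look like the sketch below. The
struct and function are hypothetical and reduce an attachment to the fields
needed to compare images; a NULL pointer stands for an absent attachment.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reference to an attached image; not Mesa's structure. */
struct image_ref {
    unsigned name; /* texture or renderbuffer object name */
    int level;     /* mipmap level */
    int layer;     /* layer or z slice */
};

/* Returns false only when both attachments are present and refer to
 * different images. Passing NULL means "not attached". */
static bool depth_stencil_same_image(const struct image_ref *depth,
                                     const struct image_ref *stencil)
{
    if (depth == NULL || stencil == NULL)
        return true; /* the restriction applies only if both are present */
    return depth->name == stencil->name &&
           depth->level == stencil->level &&
           depth->layer == stencil->layer;
}
```

A framebuffer with only a depth attachment passes this check, and so does one
whose depth and stencil attachments name the same image.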

Fixes 18 dEQP tests in:
dEQP-GLES3.functional.fbo.completeness.attachment_combinations.*

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

commit | commitdiff | tree

Eduardo Lima Mitev [Mon, 15 Dec 2014 16:04:52 +0000 (17:04 +0100)]

mesa: Initializes the stencil value masks to 0xFF instead of ~0u

'4.1.4 Stencil Test' section of the GL-ES 3.0 specification says:

    "In the initial state, [...] the front and back stencil mask are both set
    to the value 2^s − 1, where s is greater than or equal to the number of
    bits in the deepest stencil buffer* supported by the GL implementation."

Since the maximum supported precision for stencil buffers is 8 bits, mask
values should be initialized to 2^8 - 1 = 0xFF.

Currently, these masks are initialized to max unsigned integer (~0u), because
in OpenGL 3.0 and before, the initial mask values were:

    "In the initial state, stenciling is disabled, the front and back
    stencil reference value are both zero, the front and back stencil
    comparison functions are both ALWAYS, and the front and back
    stencil mask are both all ones."

The problem is that this causes the mask values to overflow to -1 when the
glGet* APIs convert them to signed integers.

Fixes 6 failing dEQP tests:
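
The overflow can be reproduced outside of Mesa with the sketch below. The
helpers are hypothetical stand-ins for the glGet* conversion paths. Note that
converting an out-of-range unsigned value to int is implementation-defined in
C before C23; the -1 result assumes a common two's complement compiler.

```c
/* Hypothetical stand-in for returning an unsigned mask through a
 * signed-integer query. */
static int mask_as_int(unsigned int mask)
{
    return (int)mask;
}

/* Hypothetical stand-in for a float query that goes through the signed
 * integer value first. */
static float mask_as_float_via_int(unsigned int mask)
{
    return (float)(int)mask;
}
```

Under that assumption, ~0u comes back as -1 (and -1.0f), while 0xFF comes
back as 255 (and 255.0f), which is the value 2^8 - 1 the specification asks
for.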
* dEQP-GLES3.functional.state_query.integers.stencil_value_mask_getfloat
* dEQP-GLES3.functional.state_query.integers.stencil_back_value_mask_getfloat
* dEQP-GLES3.functional.state_query.integers.stencil_value_mask_separate_getfloat
* dEQP-GLES3.functional.state_query.integers.stencil_value_mask_separate_both_getfloat
* dEQP-GLES3.functional.state_query.integers.stencil_back_value_mask_separate_getfloat
* dEQP-GLES3.functional.state_query.integers.stencil_back_value_mask_separate_both_getfloat

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

commit | commitdiff | tree

Eduardo Lima Mitev [Wed, 26 Nov 2014 15:44:18 +0000 (16:44 +0100)]

i965: Sets missing vertex shader constant values for HighInt format

The range's min and max, and the precision value are not set correctly for the
vertex shader constants.

Fixes 1 dEQP test: dEQP-GLES3.functional.state_query.shader.precision_vertex_highp_int

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

commit | commitdiff | tree

Marek Olšák [Mon, 12 Jan 2015 22:13:48 +0000 (23:13 +0100)]

r600g: fix build failure when building the driver without LLVM
