git.libre-soc.org Git - mesa.git/log

Dave Airlie [Wed, 27 May 2015 08:37:17 +0000 (18:37 +1000)]

tgsi: handle indirect sampler arrays. (v2)

This is required for ARB_gpu_shader5 support in softpipe.

v2: add support to txd/txf/txq paths.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>


Kenneth Graunke [Wed, 10 Jun 2015 07:52:07 +0000 (00:52 -0700)]

nir: Allow vec2/vec3/vec4 instructions in the select peephole pass.

These are basically just moves, so they should be safe as well.

When disabling i965's GLSL IR level scalarizer (channel expressions)
pass, I started seeing NIR code like this:

        if ssa_21 {
                block block_1:
                /* preds: block_0 */
                vec4 ssa_120 = vec4 ssa_82, ssa_83, ssa_84, ssa_30
                /* succs: block_3 */
        } else {
                block block_2:
                /* preds: block_0 */
                /* succs: block_3 */
        }
        block block_3:
        /* preds: block_1 block_2 */
        vec4 ssa_33 = phi block_1: ssa_120, block_2: ssa_2

Previously, the GLSL IR scalarizer pass would break the vec4 into a
series of fmovs, which were allowed by the peephole pass.  But with
the vec4 operation, they were not.  We want to keep getting selects.

Normal i965 on Broadwell:
instructions in affected programs:     200 -> 176 (-12.00%)
helped:                                4

With brw_fs_channel_expressions() disabled:
instructions in affected programs:     1832 -> 1646 (-10.15%)
helped:                                30

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>


Kenneth Graunke [Fri, 15 May 2015 16:58:42 +0000 (09:58 -0700)]

i965: Add and fix comments in brw_vue_map.c.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>


Kenneth Graunke [Fri, 15 May 2015 16:54:23 +0000 (09:54 -0700)]

i965: Split VUE map handling out of brw_vs.c into brw_vue_map.c.

This was originally only used by the vertex shader, but it's now used by
the geometry shader as well, and will also eventually be used for
tessellation control and evaluation shaders.

I suspect it will be easier to find in a file named after the concept.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>


Ben Widawsky [Thu, 4 Jun 2015 04:35:51 +0000 (21:35 -0700)]

i965/gen9: Implement Push Constant Buffer workaround

This implements a workaround (exact excerpt as a comment in the code). The docs
specify [clearly, after you struggle for a while] that the offset isn't relative
to state base. This actually makes sense. This fixes hangs on SKL.

Buffer #0 is meant to be used for normal uniforms.
Buffer #1 is typically used for gather constants when using RS.
Buffer #1-#3 could be used to push a bunch of UBO data which would just be
somewhere in memory, and not relative to the dynamic state.

NOTE: I've moved away from the ternary operator for the new gen9 conditions.
Admittedly it's probably not great to do this, but I really want to fix this all
up in the subsequent patch and doing it here makes that diff a lot nicer. I want
to split out the gen8/9 code to make the function a bit more readable, but to
keep this easily cherry-pickable I am doing this fix first. If we decide not to
merge the cleanup patch then I can revisit this.

Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Tested-by: Valtteri Rantala <Valtteri.rantala@intel.com>


Brian Paul [Mon, 22 Jun 2015 14:29:49 +0000 (08:29 -0600)]

mesa: use _mesa_lookup_enum_by_nr() in print_array()

Print GL_FLOAT, etc. instead of hex value.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>


Chia-I Wu [Mon, 22 Jun 2015 06:27:19 +0000 (14:27 +0800)]

ilo: emit 3DPRIMITIVE from gen6_3dprimitive_info

It allows us to remove ilo_ib_state::draw_start_offset and
ILO_PRIM_RECTANGLES. gen6_3d_translate_pipe_prim() is also replaced by
ilo_translate_draw_mode().


Chia-I Wu [Mon, 22 Jun 2015 06:15:52 +0000 (14:15 +0800)]

ilo: align vertex buffer size in buf_create()

With ilo_format.[ch] moved out of core, the aligning of vertex buffers does
not belong to core anymore.


Chia-I Wu [Mon, 22 Jun 2015 06:06:13 +0000 (14:06 +0800)]

ilo: move ilo_format.[ch] out of core

They provide PIPE_FORMAT_x to GEN6_FORMAT_x translation as well as some
convenient helpers. Move them out of core.


Chia-I Wu [Mon, 22 Jun 2015 05:37:05 +0000 (13:37 +0800)]

ilo: add ilo_state_surface_valid_format()

Check if a surface format can be used for the specified access type.


Chia-I Wu [Mon, 22 Jun 2015 05:15:24 +0000 (13:15 +0800)]

ilo: add ilo_state_vf_valid_element_format()

Check if a surface format can be used as a VE format.


Alexandre Courbot [Fri, 17 Oct 2014 06:05:32 +0000 (15:05 +0900)]

nvc0: use NV_VRAM_DOMAIN() macro

Use the newly-introduced NV_VRAM_DOMAIN() macro to support alternative
VRAM domains for chips that do not have dedicated video memory.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Martin Peres <martin.peres@free.fr>


Alexandre Courbot [Fri, 17 Oct 2014 05:58:11 +0000 (14:58 +0900)]

nouveau: support for custom VRAM domains

Some GPUs (e.g. GK20A, GM20B) do not embed VRAM of their own and use
the system memory as a backend instead. For such systems, allocating
objects in VRAM results in errors since the kernel will not allow
VRAM object allocations.

This patch adds a vram_domain member to struct nouveau_screen that can
optionally be initialized to an alternative domain to use for VRAM
allocations. If left untouched, NOUVEAU_BO_VRAM will be used for
systems that embed VRAM, and NOUVEAU_BO_GART will be used for VRAM-less
systems.

Code that uses GPU objects is then expected to use the NV_VRAM_DOMAIN()
macro in place of NOUVEAU_BO_VRAM to ensure correct behavior on
VRAM-less chips.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Martin Peres <martin.peres@free.fr>
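
The domain selection described above can be sketched roughly as follows.
This is only an illustration: the struct, the SKETCH_* names and the flag
values are stand-ins invented here (the real flags come from libdrm and the
real member lives in struct nouveau_screen), and the has_dedicated_vram
parameter is an assumption about how the default would be chosen.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in flag values for illustration only; not the libdrm values. */
#define SKETCH_BO_VRAM 0x1u
#define SKETCH_BO_GART 0x2u

/* Hypothetical, trimmed-down screen; only vram_domain is named in the
 * commit message. */
struct sketch_screen {
   uint32_t vram_domain;
};

/* Plays the role NV_VRAM_DOMAIN() has in the commit: callers ask the
 * screen for the domain instead of hard-coding the VRAM flag. */
#define SKETCH_VRAM_DOMAIN(screen) ((screen)->vram_domain)

/* Choose the default domain: VRAM when the chip has its own video
 * memory, GART (system memory) otherwise. */
static void
sketch_screen_init(struct sketch_screen *screen, bool has_dedicated_vram)
{
   screen->vram_domain = has_dedicated_vram ? SKETCH_BO_VRAM
                                            : SKETCH_BO_GART;
}
```

Allocation code would then pass SKETCH_VRAM_DOMAIN(screen) wherever it
previously passed the VRAM flag directly.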


Chia-I Wu [Sat, 20 Jun 2015 15:27:08 +0000 (23:27 +0800)]

ilo: add ilo_state_compute

Replace gen6_idrt_data with ilo_state_compute, which has a bunch of
validations and is now preferred.


Dave Airlie [Mon, 22 Jun 2015 03:36:41 +0000 (13:36 +1000)]

r600g: ignore sampler views for now.

This fixes a regression in that r600 stopped working when
sampler views were pushed.

Signed-off-by: Dave Airlie <airlied@redhat.com>


Rob Clark [Sat, 13 Jun 2015 13:14:31 +0000 (09:14 -0400)]

freedreno/ir3: pass sz to split_dest()

For query_levels, we generate a getinfo with writemask of (z), which RA
will consider as size==3.  But we were still generating four fanouts.
Which meant that RA would see it as two different register classes,
depending on the path to definer.  Ie. on the getinfo instruction itself
it would see size==3, but when chasing back through the fanouts it would
see size==4.

Easiest way to solve that is to just generate the chain of neighboring
fanouts to have the correct size in the first place.

Note: we may eventually want split_dest() to take start/end or wrmask
instead, since really we only need size==1.  But RA is not clever enough
for that, query_levels is not that common, and the other two registers
that get allocated are never used so those register slots can be
immediately re-used.  So a bunch of work for probably no real gain.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
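
To illustrate why a writemask of (z) counts as size==3, here is a minimal
sketch. It assumes the size is the position of the highest written
component plus one, with x in bit 0 through w in bit 3; sketch_wrmask_size()
is a hypothetical helper, not a function from ir3.

```c
#include <assert.h>

/* Number of consecutive components that must be reserved for a
 * writemask: the index of the highest set bit, plus one.
 * Assumed bit layout: x = bit 0, y = bit 1, z = bit 2, w = bit 3. */
static unsigned
sketch_wrmask_size(unsigned wrmask)
{
   unsigned size = 0;

   while (wrmask) {
      size++;
      wrmask >>= 1;
   }
   return size;
}
```

With this rule, writing only z (mask 0x4) still reserves three slots, so
generating four fanouts for it would disagree with the size seen on the
instruction itself.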


Rob Clark [Fri, 12 Jun 2015 18:27:44 +0000 (14:27 -0400)]

freedreno/ir3/nir: add more opcodes

Signed-off-by: Rob Clark <robclark@freedesktop.org>


Rob Clark [Mon, 8 Jun 2015 18:45:47 +0000 (14:45 -0400)]

freedreno/ir3: only unminify txf coords on a3xx

Seems like a4xx gets this right.

Signed-off-by: Rob Clark <robclark@freedesktop.org>


Rob Clark [Mon, 8 Jun 2015 18:23:49 +0000 (14:23 -0400)]

freedreno: remove int sampler shader variants

We get this information from NIR (which gets it from sview decl in tgsi
when translating from tgsi), so no need to maintain shader variants for
this.

Signed-off-by: Rob Clark <robclark@freedesktop.org>


Rob Clark [Tue, 9 Jun 2015 21:17:06 +0000 (17:17 -0400)]

freedreno/ir3: block reshuffling and loops!

This shuffles things around to allow the shader to have multiple basic
blocks.  We drop the entire CFG structure from nir and just preserve the
blocks.  At scheduling we know whether to schedule conditional branches
or unconditional jumps at the end of the block based on the # of block
successors.  (Dropping jumps to the following instruction, etc.)

One slight complication is that variables (load_var/store_var, ie.
arrays) are not in SSA form, so we have to figure out where to put the
phi's ourselves.  For this, we use the predecessor set information from
nir_block.  (We could perhaps use NIR's dominance frontier information
to help with this?)

Signed-off-by: Rob Clark <robclark@freedesktop.org>


Rob Clark [Mon, 1 Jun 2015 16:35:19 +0000 (12:35 -0400)]

freedreno/ir3: a4xx encodes larger immed offset

Without this, negative branch/jump offsets look like very large positive
offsets.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
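
The symptom can be reproduced with a small sketch of two's-complement
field decoding. The 16-bit and 20-bit widths used in the usage note are
hypothetical, chosen only for illustration; the commit message does not
state the actual field sizes.

```c
#include <assert.h>
#include <stdint.h>

/* Interpret the low `bits` bits of `field` as a two's-complement
 * value.  `bits` must be in the range 1..31. */
static int32_t
sketch_sign_extend(uint32_t field, unsigned bits)
{
   assert(bits >= 1 && bits <= 31);

   uint32_t mask = (1u << bits) - 1u;
   uint32_t sign = 1u << (bits - 1);

   field &= mask;
   /* Flip the sign bit, then subtract its weight again. */
   return (int32_t)(field ^ sign) - (int32_t)sign;
}
```

If -2 is encoded into a 16-bit field (0xFFFE) but decoded as a 20-bit
field, the sign bit is never set and the result is 65534: a negative
offset turned into a very large positive one.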


Rob Clark [Mon, 25 May 2015 14:59:21 +0000 (10:59 -0400)]

freedreno/ir3: simplify find_neighbors stop condition

Signed-off-by: Rob Clark <robclark@freedesktop.org>


Rob Clark [Mon, 25 May 2015 14:30:54 +0000 (10:30 -0400)]

freedreno/ir3: move inputs/outputs to shader

These belong in the shader, rather than the block. Mostly a lot of
churn and nothing too interesting. But splitting this out from the
rest of ir3_block reshuffling to cut down the noise in the later
patch.

Signed-off-by: Rob Clark <robclark@freedesktop.org>


Rob Clark [Fri, 1 May 2015 16:21:12 +0000 (12:21 -0400)]

freedreno/ir3/ra: use register_allocate

Signed-off-by: Rob Clark <robclark@freedesktop.org>


Rob Clark [Sat, 23 May 2015 17:37:41 +0000 (13:37 -0400)]

freedreno/ir3: introduce ir3_compiler object

Right now, just provides a cleaner way to get at the gpu-id, given the
separation between compiler and context. But we will need this also to
hold the reg-set for new register allocation.

Signed-off-by: Rob Clark <robclark@freedesktop.org>


Rob Clark [Sat, 25 Apr 2015 20:30:55 +0000 (16:30 -0400)]

freedreno/ir3: dump nocp option

No longer used, or even possible, with NIR frontend.

Signed-off-by: Rob Clark <robclark@freedesktop.org>


Rob Clark [Tue, 9 Jun 2015 21:42:16 +0000 (17:42 -0400)]

freedreno/ir3: silence warnings

Signed-off-by: Rob Clark <robclark@freedesktop.org>


Rob Clark [Sat, 25 Apr 2015 14:22:49 +0000 (10:22 -0400)]

freedreno/ir3: remove tgsi f/e

Also remove ir3_flatten which was only used by tgsi f/e.

Signed-off-by: Rob Clark <robclark@freedesktop.org>


Rob Clark [Thu, 30 Apr 2015 17:57:15 +0000 (13:57 -0400)]

freedreno/ir3/sched: convert to priority queue

Use a more standard priority-queue based scheduling algo. It is simpler
and will make things easier once we have multiple basic blocks and flow
control.

Signed-off-by: Rob Clark <robclark@freedesktop.org>


Rob Clark [Thu, 30 Apr 2015 15:38:43 +0000 (11:38 -0400)]

freedreno/ir3: use standard list implementation

Use standard list_head doubly-linked list and related iterators,
helpers, etc, rather than weird combo of instruction array and next
pointers depending on stage. Now block has an instrs_list. In
certain stages where we want to remove and re-add to the blocks list
we just use list_replace() to copy the list to a new list_head.

Signed-off-by: Rob Clark <robclark@freedesktop.org>


Rob Clark [Thu, 30 Apr 2015 14:10:14 +0000 (10:10 -0400)]

freedreno/ir3: drop dot graph dumping

At least for now.. right now the instruction and instruction list
printing should suffice, and the re-working of ir3_block would require
a lot of changes in that code.

Signed-off-by: Rob Clark <robclark@freedesktop.org>


Rob Clark [Sat, 25 Apr 2015 15:05:27 +0000 (11:05 -0400)]

freedreno/ir3: more builder helpers

Use ir3_MOV() builder in a couple of spots, rather than open-coding the
instruction construction. Also add ir3_NOP() builder and use that
instead of open coding.

Signed-off-by: Rob Clark <robclark@freedesktop.org>


Rob Clark [Thu, 30 Apr 2015 19:20:03 +0000 (15:20 -0400)]

gallium/ttn: add missing SNE

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>


Rob Clark [Wed, 29 Apr 2015 12:38:45 +0000 (08:38 -0400)]

util/list: add list_first/last_entry

I need an easier way to get at head/tail in ir3.

Signed-off-by: Rob Clark <robclark@freedesktop.org>


Rob Clark [Mon, 8 Jun 2015 18:09:09 +0000 (14:09 -0400)]

gallium/ttn: add texture-type support

v2: rebased on using SVIEW to hold type information

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>


Rob Clark [Mon, 8 Jun 2015 17:20:30 +0000 (13:20 -0400)]

glsl_to_tgsi: add SVIEW decl support

Freedreno needs sampler type information to deal with int/uint textures.
To accomplish this, start creating sampler-view declarations, as
suggested here:

http://lists.freedesktop.org/archives/mesa-dev/2014-November/071583.html

Create a sampler-view with index matching the sampler, to encode the
texture type (ie. SINT/UINT/FLOAT).  Ie:

   DCL SVIEW[n], 2D, UINT
   DCL SAMP[n]
   TEX OUT[1], IN[1], SAMP[n]

For tgsi texture instructions which do not take an explicit SVIEW
argument, the SVIEW index is implied by the SAMP index.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>


Rob Clark [Thu, 11 Jun 2015 00:02:55 +0000 (20:02 -0400)]

util/blitter (and friends): generate appropriate SVIEW decls

Some hardware needs to know the sampler type. Update the blit related
shaders to include SVIEW decl.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>


Rob Clark [Thu, 11 Jun 2015 00:01:11 +0000 (20:01 -0400)]

util/pstipple: updates for SVIEW decls

To allow for shaders which use SVIEW decls for TEX* instructions, we
need to preserve the constraint that the shader either has no SVIEW's or
it has one matching SVIEW for each SAMP.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>


Rob Clark [Wed, 10 Jun 2015 23:59:20 +0000 (19:59 -0400)]

draw: updates to support SVIEW decls

To allow for shaders which use SVIEW decls for TEX* instructions, we
need to preserve the constraint that the shader either has no SVIEW's or
it has one matching SVIEW for each SAMP.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>


Rob Clark [Wed, 10 Jun 2015 23:51:32 +0000 (19:51 -0400)]

tgsi/transform: add support for SVIEW decls

TODO single return_type (use enum)

v2: single return_type arg, and use enum

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>


Rob Clark [Wed, 10 Jun 2015 23:49:55 +0000 (19:49 -0400)]

tgsi: update docs for SVIEW usage with TEX* instructions

Based on mailing list discussion here:

http://lists.freedesktop.org/archives/mesa-dev/2014-November/071583.html

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>


Eric Anholt [Sat, 20 Jun 2015 22:02:50 +0000 (15:02 -0700)]

mesa: Back out an accidental change I had in a VC4 commit.

This was a hack as part of debugging some glamor-on-GLES2 behavior that
ended up being an xserver bug. I suspect we can just flip this extension
on for GLES2, but the spec says it requires 3.1.


Emil Velikov [Sat, 20 Jun 2015 15:40:56 +0000 (16:40 +0100)]

docs: add news item and link release notes for mesa 10.5.8

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>


Emil Velikov [Sat, 20 Jun 2015 15:37:16 +0000 (16:37 +0100)]

docs: Add sha256sums for the 10.5.8 release

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit a81b1d5512f64ffca1c13a5937e7eb0de24713ae)


Emil Velikov [Sat, 20 Jun 2015 14:14:45 +0000 (15:14 +0100)]

Add release notes for the 10.5.8 release

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 24b043aab73ce066ded6e4bc93f589008dfc8484)


Eric Anholt [Sat, 20 Jun 2015 02:47:44 +0000 (19:47 -0700)]

vc4: Use a defined t value for 1D textures.

This doesn't fix the broken 1D cases of texsubimage, but it does prevent
segfaulting when dumping the QIR code generated in fbo-1d.


Eric Anholt [Sat, 20 Jun 2015 02:41:25 +0000 (19:41 -0700)]

vc4: Fix write-only texsubimage when we had to align.

We need to make sure that when we store the aligned box, we've got
initialized contents in the border. We could potentially just load the
border area, but for now let's get text rendering working in X (and fix
the GL_TEXTURE_2D errors in piglit's texsubimage test and
gl-2.1-pbo/test_tex_image)


Chia-I Wu [Thu, 18 Jun 2015 14:48:14 +0000 (22:48 +0800)]

ilo: clean up header includes

Core is more self-contained now.


Chia-I Wu [Fri, 19 Jun 2015 16:34:29 +0000 (00:34 +0800)]

ilo: avoid ilo_ib_state in genX_3DPRIMITIVE()

ilo_ib_state is not in core.


Chia-I Wu [Thu, 18 Jun 2015 14:47:20 +0000 (22:47 +0800)]

ilo: move gen6_so_SURFACE_STATE() out of core

It does not belong to core.


Chia-I Wu [Mon, 15 Jun 2015 07:17:45 +0000 (15:17 +0800)]

ilo: add ilo_state_sol_buffer

It serves the same purpose as ilo_state_vertex_buffer does.


Chia-I Wu [Fri, 19 Jun 2015 07:10:02 +0000 (15:10 +0800)]

ilo: add ilo_state_index_buffer

It serves the same purpose as ilo_state_vertex_buffer does.


Chia-I Wu [Fri, 19 Jun 2015 07:06:50 +0000 (15:06 +0800)]

ilo: add ilo_state_vertex_buffer

Being a parameter-like state, we may want to get rid of
ilo_state_vertex_buffer_info or ilo_state_vertex_buffer eventually. But we
want them now as they are how we do cross-validation right now.


Chia-I Wu [Thu, 18 Jun 2015 06:26:29 +0000 (14:26 +0800)]

ilo: add 3DSTATE_VF_INSTANCING to ilo_state_vf

3DSTATE_VF_INSTANCING specifies instancing enable and step rate. They are
specified along with 3DSTATE_VERTEX_BUFFERS instead prior to Gen8. Both
commands are added.


Chia-I Wu [Tue, 16 Jun 2015 15:11:06 +0000 (23:11 +0800)]

ilo: add 3DSTATE_VF to ilo_state_vf

3DSTATE_VF specifies cut index enable and cut index. Cut index enable is
specified in 3DSTATE_INDEX_BUFFER instead prior to Gen7.5. Both commands are
added.


Chia-I Wu [Thu, 18 Jun 2015 05:55:32 +0000 (13:55 +0800)]

ilo: embed pipe_index_buffer in ilo_ib_state

Make it obvious that we save a copy of pipe_index_buffer.


Chia-I Wu [Fri, 19 Jun 2015 15:29:32 +0000 (23:29 +0800)]

ilo: fix a buffer overrun

Add missing parentheses in SURFTYPE_NULL initialization.


Chia-I Wu [Fri, 19 Jun 2015 15:24:17 +0000 (23:24 +0800)]

ilo: fix a -Wmaybe-uninitialized warning

ilo_shader.c: In function ‘ilo_shader_select_kernel_sbe’:
ilo_shader.c:1140:27: warning: ‘src_skip’ may be used uninitialized in this
function [-Wmaybe-uninitialized]


Brian Paul [Fri, 19 Jun 2015 22:45:44 +0000 (16:45 -0600)]

glsl: fix formatting glitch in _mesa_print_ir()

Print the closing ) before the newline. Trivial.


Ben Widawsky [Fri, 19 Jun 2015 01:45:47 +0000 (18:45 -0700)]

i965/gen8: Use HALIGN_16 for single sample mcs buffers

The original code meant to do this, but was only checking num_samples == 1 to
figure out if a surface was fast clear capable. However, we can allocate single
sample miptrees with num_samples == 0 (when it's an internally created buffer).

This fixes a bunch of the piglit tests on gen8. Other gens should have been
fine.

Here is the order of events that allowed this to slip through:
t0: I wrote halign patches and tested them. These alignment assertions are for
   gen8 fast clear surfaces, basically.
t1: I pushed bogus perf patch which made fast clears never happen
t2: Reworked halign patches based on Chad's feedback and introduced the bug this
   patch fixes.
t2.5: I tested reworked patches, but assertion wasn't hit because of t1.
t3: Matt fixed issue in t1 which made fast clears happen here:
commit 22af95af8316f2888a3935cdf774ff0997b3dd42
Author: Matt Turner <mattst88@gmail.com>
Date:   Thu Jun 18 16:14:50 2015 -0700

    i965: Add missing braces around if-statement.

This logic should match that of the v1 of my halign patch series.

Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Matt Turner <mattst88@gmail.com>
Reported-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Tested-by: Mark Janes <mark.a.janes@intel.com>
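
The check at the heart of this fix can be sketched as below. The helper
name is hypothetical; the only fact taken from the commit message is that
a single-sample miptree may carry num_samples == 0 as well as 1.

```c
#include <assert.h>
#include <stdbool.h>

/* A miptree is single-sampled when num_samples is 0 (internally
 * created buffer) or 1.  Testing "== 1" alone misses the 0 case. */
static bool
sketch_is_single_sampled(unsigned num_samples)
{
   return num_samples <= 1;
}
```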


Ilia Mirkin [Fri, 19 Jun 2015 16:08:24 +0000 (12:08 -0400)]

mesa: move ARB_gs5 enums to core, EXT_polygon_offset_clamp to desktop

When adding EXT_polygon_offset_clamp, I first made it core-only, and
never moved the enum getter back to the GL/GL_CORE section. Similarly,
ARB_gs5 is a core-only extension, so move its getters to the GL_CORE
section.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Matt Turner <mattst88@gmail.com>


Brian Paul [Fri, 19 Jun 2015 00:03:29 +0000 (18:03 -0600)]

u_vbuf: fix src_offset alignment in u_vbuf_create_vertex_elements()

If the driver says PIPE_CAP_VERTEX_ELEMENT_SRC_OFFSET_4BYTE_ALIGNED_ONLY=1,
the driver should never receive a pipe_vertex_element::src_offset value
that's not a multiple of four. But the vbuf code wasn't actually adjusting
the src_offset value when creating the vertex element state object.

We just need to align the src_offset values put in the driver_attribs[]
array.

See the piglit gl-1.5-vertex-buffer-offsets test.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
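
Aligning an offset to a multiple of four can be sketched with a generic
power-of-two helper. sketch_align_pot() is a hypothetical name, and
rounding up (rather than down) is an assumption made for this example;
the commit message only says the values need to be aligned.

```c
#include <assert.h>
#include <stdint.h>

/* Round `value` up to the next multiple of `alignment`, which must be
 * a non-zero power of two. */
static uint32_t
sketch_align_pot(uint32_t value, uint32_t alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}
```

For example, an offset of 6 becomes 8, while offsets that are already
multiples of four pass through unchanged.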


Brian Paul [Thu, 18 Jun 2015 23:53:42 +0000 (17:53 -0600)]

gallium: whitespace, formatting clean-up in p_state.h

Remove trailing whitespace, move some braces, 78-column wrapping.
Trivial.


Brian Paul [Tue, 16 Jun 2015 21:32:46 +0000 (15:32 -0600)]

st/wgl: fix WGL_SWAP_METHOD_ARB query

There are three possible return values (not two): WGL_SWAP_COPY_ARB,
WGL_SWAP_EXCHANGE_EXT and WGL_SWAP_UNDEFINED_ARB.

VMware bug 1431184

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>


Brian Paul [Tue, 16 Jun 2015 21:32:46 +0000 (15:32 -0600)]

stw: use new stw_get_nop_function() function to avoid Viewperf 12 crashes

Also, print a warning if we do return NULL from wglGetProcAddress() to
help spot this sort of problem in the future.

Reviewed-by: José Fonseca <jfonseca@vmware.com>


Brian Paul [Tue, 16 Jun 2015 21:32:46 +0000 (15:32 -0600)]

stw: add some no-op functions for GL_EXT_dsa, GL_NV_half_float

Viewperf 12 calls wglGetProcAddress() to get pointers to some unsupported
DSA and half-float functions. We return NULL but Viewperf doesn't check
for null before trying to jump through the pointer. That causes a crash.

This patch adds no-op functions to call instead (used by the next patch).
This avoids the crash but the rendering is incorrect.

Some DSA functions are being added to Mesa at this time so we may be
able to remove some of these no-ops in the future.

More no-op functions may be added as needed.

VMware PR1383421

Reviewed-by: José Fonseca <jfonseca@vmware.com>
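
The idea of handing back a no-op instead of NULL can be sketched as a
name lookup. Everything here is hypothetical: the entry-point names are
placeholders, and a single void(void) no-op is used for brevity, whereas
real no-ops would need a signature matching each GL function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void (*sketch_proc)(void);

/* Do-nothing stand-in for an unsupported entry point. */
static void
sketch_nop(void)
{
}

/* Placeholder names for entry points we do not implement. */
static const char *const sketch_unsupported[] = {
   "glSketchUnsupportedA",
   "glSketchUnsupportedB",
};

#define SKETCH_COUNT(a) (sizeof(a) / sizeof((a)[0]))

/* Return a callable no-op for known-unsupported names, NULL otherwise. */
static sketch_proc
sketch_get_nop_function(const char *name)
{
   for (size_t i = 0; i < SKETCH_COUNT(sketch_unsupported); i++) {
      if (strcmp(name, sketch_unsupported[i]) == 0)
         return sketch_nop;
   }
   return NULL;
}
```

A caller that jumps through the returned pointer without checking it no
longer crashes for the listed names, although nothing is rendered by the
no-op.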


Jose Fonseca [Tue, 16 Jun 2015 21:32:46 +0000 (15:32 -0600)]

st/wgl: Don't return core profile for 3.1 contexts.

WGL_CONTEXT_PROFILE_MASK_ARB doesn't apply to desktop OpenGL versions
less than 3.2 -- applications can't specify whether they want a core or
a compat 3.1 context -- instead they are supposed to check whether the
returned context advertises the GL_ARB_compatibility extension.

Mesa doesn't support compatibility contexts for versions higher than 3.1,
so we used to return a core profile context, but this makes several Windows
applications unhappy, because they just assume they got a compatibility
context without checking.

So it seems safer on Windows to never return core profile for 3.1,
ie, just fail the context creation.

VMware PR1365920.

Reviewed-by: Brian Paul <brianp@vmware.com>


Brian Paul [Tue, 16 Jun 2015 21:32:46 +0000 (15:32 -0600)]

st/wgl: set PIPE_BIND_SAMPLER_VIEW for window color buffers

To allow sampling from the surface for things like glCopyPixels
or glCopyTexSubImage.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>


Brian Paul [Tue, 16 Jun 2015 21:32:45 +0000 (15:32 -0600)]

st/wgl: add support for multisample pixel formats

Create pixel formats with 0, 4, 8 and 16 samples per pixel.
Add a SVGA_FORCE_MSAA env var to force creating all pixel formats
with a particular sample count. This is useful for testing Mesa/GLUT/
etc. programs which don't ordinarily use multisample.

Reviewed-by: Matthew McClure <mcclurem@vmware.com>


Brian Paul [Tue, 16 Jun 2015 21:32:45 +0000 (15:32 -0600)]

st/wgl: respect sample count when creating framebuffer surfaces

Use the visual/pixel format's sample count instead of zero.

Reviewed-by: Matthew McClure <mcclurem@vmware.com>


Brian Paul [Tue, 16 Jun 2015 21:32:45 +0000 (15:32 -0600)]

st/wgl: fix WGL_SAMPLE_BUFFERS_ARB query

Only report 1 for WGL_SAMPLE_BUFFERS_ARB if the number of samples
per pixel > 1.

Reviewed-by: Matthew McClure <mcclurem@vmware.com>
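
The rule stated above is small enough to write out directly. The function
name is hypothetical; only the "> 1" condition comes from the commit
message.

```c
#include <assert.h>

/* Value to report for a sample-buffers query: 1 only when the pixel
 * format has more than one sample per pixel, 0 otherwise. */
static int
sketch_sample_buffers(unsigned samples_per_pixel)
{
   return samples_per_pixel > 1 ? 1 : 0;
}
```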


Brian Paul [Sat, 13 Jun 2015 14:07:08 +0000 (08:07 -0600)]

tgsi: add comments for ureg_emit_label()


Brian Paul [Sat, 13 Jun 2015 13:58:53 +0000 (07:58 -0600)]

tgsi: new comments, assertion for executing TGSI_OPCODE_CAL


Timothy Arceri [Fri, 19 Jun 2015 03:03:36 +0000 (13:03 +1000)]

docs: update developer info

Update piglit link to the current Piglit website.

Add note about updating patchwork when sending patch revisions.

Acked-by: Matt Turner <mattst88@gmail.com>


Jose Fonseca [Thu, 18 Jun 2015 14:47:00 +0000 (15:47 +0100)]

llvmpipe: Truncate the binned constants to max const buffer size.

Tested with Ilia Mirkin's gzdoom.trace and
"arb_uniform_buffer_object-maxuniformblocksize fsexceed" piglit test
without my earlier fix to fail linkage when UBO exceeds
GL_MAX_UNIFORM_BLOCK_SIZE.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>


Jose Fonseca [Mon, 15 Jun 2015 17:29:02 +0000 (18:29 +0100)]

glsl: Fail linkage when UBO exceeds GL_MAX_UNIFORM_BLOCK_SIZE.

It's not totally clear whether other Mesa drivers can safely cope with
over-sized UBOs, but at least for llvmpipe receiving a UBO larger than
its limit causes problems, as it won't fit into its internal display
lists.

This fixes piglit "arb_uniform_buffer_object-maxuniformblocksize
fsexceed" without regressions for llvmpipe.

NVIDIA driver also fails to link the shader from
"arb_uniform_buffer_object-maxuniformblocksize fsexceed".

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65525

PS: I don't recommend cherry-picking this for Mesa stable, as some app
might inadvertently have been relying on UBOs larger than
GL_MAX_UNIFORM_BLOCK_SIZE to work on other drivers, so even if this
commit is universally accepted it's probably best to let it mature in
master for a while.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>


Ilia Mirkin [Thu, 18 Jun 2015 23:08:24 +0000 (19:08 -0400)]

glsl: guard gl_NumSamples enablement on ARB_sample_shading

gl_NumSamples should only be enabled when ARB_sample_shading is enabled.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>


Matt Turner [Thu, 18 Jun 2015 23:14:50 +0000 (16:14 -0700)]

i965: Add missing braces around if-statement.

Fixes a performance problem caused by commit b639ed2f.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90895


Jordan Justen [Tue, 16 Jun 2015 21:27:15 +0000 (14:27 -0700)]

i965/compute: Fix undefined code with right_mask for SIMD32

Although we don't support SIMD32, krh pointed out that the left shift
by 32 is undefined by C/C++ for 32-bit integers.

Suggested-by: Kristian Høgsberg <krh@bitplanet.net>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
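
One common way to avoid the undefined shift is to build the mask by
shifting right from all-ones instead of shifting left from one. This is a
generic sketch of that technique with a hypothetical helper name; it is
not necessarily the exact expression used in the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with the low `width` bits set, for width in 1..32.
 * "(1u << 32) - 1" would be undefined for a 32-bit operand, so the
 * shift count is kept in 0..31 by shifting right from all-ones. */
static uint32_t
sketch_low_bits_mask(unsigned width)
{
   assert(width >= 1 && width <= 32);
   return UINT32_MAX >> (32 - width);
}
```

The width == 32 case now yields 0xFFFFFFFF through a shift by zero, which
is well defined.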


Ilia Mirkin [Thu, 18 Jun 2015 03:00:44 +0000 (23:00 -0400)]

mesa: add GL_PROGRAM_PIPELINE support in KHR_debug calls

This was apparently missed when ARB_sso support was added.
Add label support to pipeline objects just like all the other
debug-related objects.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>


Ilia Mirkin [Wed, 17 Jun 2015 19:09:26 +0000 (15:09 -0400)]

glsl: add version checks to conditionals for builtin variable enablement

A number of builtin variables have checks based on the extension being
enabled, but were missing enablement via a higher GLSL version.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>


Ilia Mirkin [Wed, 17 Jun 2015 19:07:14 +0000 (15:07 -0400)]

glsl: handle conversions to double when comparing param matches

This allows mod(int, int) to become selected as float mod when doubles
are supported.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "10.6" <mesa-stable@lists.freedesktop.org>

commit | commitdiff | tree

Emil Velikov [Thu, 18 Jun 2015 11:59:28 +0000 (12:59 +0100)]

ilo: remove missing ilo_fence.h from the sources list

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

commit | commitdiff | tree

Boyan Ding [Tue, 16 Jun 2015 03:08:33 +0000 (11:08 +0800)]

egl/x11: Set version of swrastLoader to 2

which it actually implements instead of the newest version defined in
dri_interface.h

Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>

commit | commitdiff | tree

Eric Anholt [Wed, 17 Jun 2015 20:24:06 +0000 (13:24 -0700)]

vc4: Move tile state/alloc allocation into the kernel.

This avoids a security issue where userspace could have written the tile
state/tile alloc behind the GPU's back, and will apparently be necessary
for fixing stability bugs (tile state buffers are missing some top bits
for the tile alloc's address).

commit | commitdiff | tree

Eric Anholt [Wed, 10 Jun 2015 19:36:47 +0000 (12:36 -0700)]

vc4: Move RCL generation into the kernel.

There weren't that many variations of RCL generation, and this lets us
skip all the in-kernel validation for what we generated.

commit | commitdiff | tree

Eric Anholt [Wed, 17 Jun 2015 20:51:55 +0000 (13:51 -0700)]

vc4: Add dumping of VC4_PACKET_TILE_BINNING_MODE_CONFIG.

commit | commitdiff | tree

Eric Anholt [Thu, 18 Jun 2015 06:49:19 +0000 (23:49 -0700)]

vc4: Fix memory leak from simple_list conversion.

I accidentally shadowed the outside declaration, so we always returned
NULL even when we'd found something in the cache.
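
The kind of bug described above can be sketched as follows. This is an illustrative reduction with hypothetical names (`struct entry`, `lookup_shadowed`, `lookup_fixed`), not the vc4 cache code:

```c
#include <stddef.h>

/* Hypothetical cache entry, for illustration only. */
struct entry {
    int key;
    int value;
};

/* Buggy variant: the inner declaration shadows the outer `found`, so the
 * outer variable is never assigned and the function always returns NULL. */
static const struct entry *lookup_shadowed(const struct entry *tab,
                                           size_t n, int key)
{
    const struct entry *found = NULL;
    for (size_t i = 0; i < n; i++) {
        if (tab[i].key == key) {
            const struct entry *found = &tab[i]; /* shadows the outer one */
            (void)found;
            break;
        }
    }
    return found;
}

/* Fixed variant: assign to the outer variable instead of redeclaring it. */
static const struct entry *lookup_fixed(const struct entry *tab,
                                        size_t n, int key)
{
    const struct entry *found = NULL;
    for (size_t i = 0; i < n; i++) {
        if (tab[i].key == key) {
            found = &tab[i];
            break;
        }
    }
    return found;
}
```

A caller that allocates a fresh object whenever the lookup returns NULL will never reuse cached ones with the buggy variant. Building with `-Wshadow` under GCC or Clang flags the inner declaration.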

commit | commitdiff | tree

Eric Anholt [Thu, 18 Jun 2015 05:56:15 +0000 (22:56 -0700)]

vc4: Track the number of BOs allocated and their size.

This is useful for BO leak debugging.

commit | commitdiff | tree

Iago Toral Quiroga [Tue, 24 Feb 2015 18:02:50 +0000 (19:02 +0100)]

i965: Fix textureGrad with cube samplers

We can't use sampler messages with gradient information (like
sample_g or sample_d) to deal with this scenario because according
to the PRM:

"The r coordinate and its gradients are required only for surface
types that use the third coordinate. Usage of this message type on
cube surfaces assumes that the u, v, and gradients have already been
transformed onto the appropriate face, but still in [-1,+1] range.
The r coordinate contains the faceid, and the r gradients are ignored
by hardware."

Instead, we should lower this to compute the LOD manually based on the
gradients and use a different sample message that takes the computed
LOD instead of the gradients. This is already being done in
brw_lower_texture_gradients.cpp, but it is restricted to shadow
samplers only, although there is a comment stating that we should
probably do this also for samplerCube and samplerCubeArray.

Because of this, both dEQP and Piglit test cases for textureGrad with
cube maps currently fail.

This patch does two things:
1) Activates the texturegrad lowering pass for all cube samplers.
2) Corrects the computation of the LOD value for cube samplers.

I had to do 2) because for cube maps the calculations implemented
in the lowering pass always compute a value of rho that is twice
the value we want (so we get a LOD value one unit larger than we
want). This only happens for cube map samplers (all kinds). I am
not sure why we need to do this, but I suspect that it is
related to the fact that cube map coordinates, when transformed
onto a specific face of the cube, are in the range [-1, 1] instead of
[0, 1] so we probably need to divide the derivatives by 2 when
we compute the LOD. Doing that would produce the same result as
dividing the final rho computation by 2 (or removing a unit
from the computed LOD, which is what we are doing here).
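
The equivalence claimed above can be written out as a short derivation. It assumes the usual definition of the LOD as the base-2 logarithm of rho, with rho scaling linearly with the coordinate derivatives; the primed symbols are introduced here only for illustration:

```latex
\begin{aligned}
\lambda &= \log_2 \rho \\
\rho'   &= \tfrac{1}{2}\,\rho
  \quad\text{(derivatives divided by 2)} \\
\lambda' &= \log_2 \rho' = \log_2 \rho - \log_2 2 = \lambda - 1
\end{aligned}
```

So halving the derivatives, halving rho, and subtracting one from the computed LOD all give the same result under that assumption.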

Fixes the following piglit tests:
bin/tex-miplevel-selection textureGrad Cube -auto -fbo
bin/tex-miplevel-selection textureGrad CubeArray -auto -fbo
bin/tex-miplevel-selection textureGrad CubeShadow -auto -fbo

Fixes 10 dEQP tests in the following category:
dEQP-GLES3.functional.shaders.texture_functions.texturegrad.*cube*

Reviewed-by: Ben Widawsky <ben@bwidawsk.net>

commit | commitdiff | tree

Ilia Mirkin [Thu, 18 Jun 2015 02:18:09 +0000 (22:18 -0400)]

nvc0/ir: can't have a join on a load with an indirect source

Triggers an INVALID_OPCODE warning on GK208. Seems rare enough to not
warrant verification on other chips. Fixes the new piglits:

ubo_array_indexing/fs-nonuniform-control-flow.shader_test
ubo_array_indexing/vs-nonuniform-control-flow.shader_test

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.5 10.6" <mesa-stable@lists.freedesktop.org>

commit | commitdiff | tree

Kevin Rogovin [Wed, 17 Jun 2015 10:29:59 +0000 (13:29 +0300)]

docs: mark GL_ARB_framebuffer_no_attachments done for i965

Mark GL_ARB_framebuffer_no_attachments as done for i965.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>

commit | commitdiff | tree

Kevin Rogovin [Wed, 17 Jun 2015 10:29:58 +0000 (13:29 +0300)]

i965: enable ARB_framebuffer_no_attachments for Gen7+

Enable GL_ARB_framebuffer_no_attachments in i965 for Gen7 and higher.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>

commit | commitdiff | tree

Kevin Rogovin [Wed, 17 Jun 2015 10:29:57 +0000 (13:29 +0300)]

i965: execution of frag-shader when it has atomic buffer

Ensure that the GPU spawns the fragment shader thread for those
fragment shaders with atomic buffer access.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>

commit | commitdiff | tree

Kevin Rogovin [Wed, 17 Jun 2015 10:29:56 +0000 (13:29 +0300)]

mesa: function for testing if current frag-shader has atomics

Add a helper function that checks whether the currently active fragment
shader of a gl_context has atomic buffer access.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>

commit | commitdiff | tree

Kevin Rogovin [Wed, 17 Jun 2015 10:29:55 +0000 (13:29 +0300)]

i965: Use _mesa_geometric_ functions appropriately

Change references to gl_framebuffer::Width, Height, MaxNumLayers
and Visual::samples to use the _mesa_geometric_ convenience functions
for those places where the geometry of the gl_framebuffer is needed
(in contrast to the geometry of the intersection of the attachments
of the gl_framebuffer).

This patch paves the way for enabling GL_ARB_framebuffer_no_attachments
on Gen7 and higher in i965.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>

commit | commitdiff | tree

Kevin Rogovin [Wed, 17 Jun 2015 10:29:54 +0000 (13:29 +0300)]

mesa: helper function for scissor box of gl_framebuffer

Add helper convenience function that intersects the scissor values
against a passed bounding box. In addition, to avoid replicated code,
make the function _mesa_scissor_bounding_box() use this new function.
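
Intersecting a scissor rectangle with a bounding box can be sketched as below. The types and the function name are hypothetical and chosen for illustration; this is not Mesa's interface:

```c
/* Hypothetical types, for illustration only. */
struct box {
    int xmin, xmax, ymin, ymax;
};

struct scissor {
    int x, y, width, height;
};

/* Shrink `b` in place to its intersection with the scissor rectangle `s`.
 * If the two do not overlap, the box collapses to zero width or height. */
static void intersect_scissor(const struct scissor *s, struct box *b)
{
    if (s->x > b->xmin)
        b->xmin = s->x;
    if (s->y > b->ymin)
        b->ymin = s->y;
    if (s->x + s->width < b->xmax)
        b->xmax = s->x + s->width;
    if (s->y + s->height < b->ymax)
        b->ymax = s->y + s->height;

    /* An empty intersection would leave min > max; clamp it instead. */
    if (b->xmin > b->xmax)
        b->xmin = b->xmax;
    if (b->ymin > b->ymax)
        b->ymin = b->ymax;
}
```

Taking the box as a parameter, rather than reading it from the framebuffer, is what lets the same routine serve both the attachment-based and the attachment-less geometry.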

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>

commit | commitdiff | tree

Kevin Rogovin [Wed, 17 Jun 2015 10:29:53 +0000 (13:29 +0300)]

mesa: add helper functions for geometry of gl_framebuffer

Add convenience helper functions for fetching the geometry of a
gl_framebuffer. When the gl_framebuffer has no attachments, they return
the geometry of the gl_framebuffer itself instead of the geometry of its
buffers.
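
The idea behind such helpers can be sketched as follows. The struct and its field names are assumptions made for this example and do not reproduce Mesa's gl_framebuffer layout:

```c
#include <stdbool.h>

/* Hypothetical, simplified framebuffer; field names are illustrative. */
struct fb {
    bool has_attachments;
    unsigned width, height;                 /* derived from the attachments */
    unsigned default_width, default_height; /* set by the application */
};

/* Return the width to use for rasterization: the attachment-derived width
 * if there are attachments, otherwise the application-supplied default. */
static unsigned fb_geometric_width(const struct fb *f)
{
    return f->has_attachments ? f->width : f->default_width;
}

/* Same selection for the height. */
static unsigned fb_geometric_height(const struct fb *f)
{
    return f->has_attachments ? f->height : f->default_height;
}
```

Callers that need the rasterization geometry go through the helpers, while code that cares about the actual attachment sizes keeps reading the attachment-derived fields directly.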

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>

commit | commitdiff | tree

Kevin Rogovin [Wed, 17 Jun 2015 10:29:52 +0000 (13:29 +0300)]

mesa: Complete ARB_framebuffer_no_attachments in Mesa core

Implement GL_ARB_framebuffer_no_attachments in Mesa core:
- change the conditions for framebuffer completeness
- implement the set/get functions for framebuffers for the
new functions in GL_ARB_framebuffer_no_attachments

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>

commit | commitdiff | tree

Kevin Rogovin [Wed, 17 Jun 2015 10:29:51 +0000 (13:29 +0300)]

mesa: Constants and functions for ARB_framebuffer_no_attachments

Define the enumeration constants, function entry points and
glGet support for GL_ARB_framebuffer_no_attachments.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>
