mesa.git
10 years agoi965/cfg: Embed exec_node in bblock_link.
Matt Turner [Mon, 12 May 2014 21:40:40 +0000 (14:40 -0700)]
i965/cfg: Embed exec_node in bblock_link.

In order to remove bblock_link's inheritance of exec_node. Also makes
linked list walk code much nicer.

Acked-by: Eric Anholt <eric@anholt.net>
10 years agoi965/cfg: Make brw_cfg.h closer to C-includable.
Matt Turner [Mon, 12 May 2014 16:54:15 +0000 (09:54 -0700)]
i965/cfg: Make brw_cfg.h closer to C-includable.

Only bblock_link's inheritance left.

Acked-by: Eric Anholt <eric@anholt.net>
10 years agoi965/cfg: Protect brw_cfg.h from multiple inclusion.
Matt Turner [Wed, 19 Feb 2014 22:47:57 +0000 (14:47 -0800)]
i965/cfg: Protect brw_cfg.h from multiple inclusion.

Acked-by: Eric Anholt <eric@anholt.net>
10 years agoglsl: Add C-callable fprint_ir function.
Matt Turner [Tue, 13 May 2014 01:16:22 +0000 (18:16 -0700)]
glsl: Add C-callable fprint_ir function.

Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agoi965/fb: Use meta path for stencil up/downsampling
Topi Pohjolainen [Mon, 12 May 2014 09:42:28 +0000 (12:42 +0300)]
i965/fb: Use meta path for stencil up/downsampling

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
10 years agoi965/meta: Stencil blit for miptree updownsampling
Topi Pohjolainen [Mon, 12 May 2014 09:35:40 +0000 (12:35 +0300)]
i965/meta: Stencil blit for miptree updownsampling

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agoi965/fb: Use meta path for stencil blits
Topi Pohjolainen [Sat, 19 Apr 2014 14:11:10 +0000 (17:11 +0300)]
i965/fb: Use meta path for stencil blits

This is effective only on gen8 for now as previous generations still
go through blorp.

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agoi965/meta: Stencil blits
Topi Pohjolainen [Mon, 5 May 2014 19:18:46 +0000 (22:18 +0300)]
i965/meta: Stencil blits

v2: Create the intel renderbuffer with level hardcoded to zero instead
    of overriding it in the surface state configuration. Also moved the
    dimension adjustments for tiling, mip level, msaa into the render
    buffer creation. Finally prepares for another blit path needed for
    miptree updownsampling.
v3 (Ken): Dropped unnecessary memory context for "ralloc_asprintf()"

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
10 years agoi965: Extend brw_get_rb_for_first_slice() for specified level/layer
Topi Pohjolainen [Fri, 18 Apr 2014 23:02:42 +0000 (02:02 +0300)]
i965: Extend brw_get_rb_for_first_slice() for specified level/layer

v2: Configure stencil directly for final dimensions instead of
    adjusting bit by bit for tiling, mip level and msaa.
v3 (Ken): Used non-static constant for horizontal alignment

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agoi965/gen8: Surface state overriding for stencil
Topi Pohjolainen [Wed, 7 May 2014 09:16:28 +0000 (12:16 +0300)]
i965/gen8: Surface state overriding for stencil

v2: Allow hardware to offset accesses to individual layers. Also leave
    the mip-level overriding for the creator of the intel renderbuffer
    to handle. Merged with "i965/gen8: Allow stencil buffers to be
    configured as single sampled"

Ken: I left the "_mesa_problem()" still in place. I think it is clearer
     to remove it in a separate patch.

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agoi965/wm: Surface state overrides for configuring w-tiled as y-tiled
Topi Pohjolainen [Wed, 7 May 2014 07:49:50 +0000 (10:49 +0300)]
i965/wm: Surface state overrides for configuring w-tiled as y-tiled

v2: Use intel_mipmap_tree::total_width in order to get correct alignment
    automatically. Also use "mt->total_height / mt->physical_depth0" as
    surface height allowing hardware to offset to correct slice.

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agoi965 meta up/downsample: Fix renderbuffer _BaseFormat
Jordan Justen [Thu, 15 May 2014 06:06:47 +0000 (06:06 +0000)]
i965 meta up/downsample: Fix renderbuffer _BaseFormat

mt->format is of type mesa_format, and therefore can't be
used with _mesa_base_fbo_format which requires a GLenum input.

On gen8, this fixes various piglit fbo-depthstencil tests with
samples > 1.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
10 years agoi965: Delete current_insn() function.
Matt Turner [Mon, 5 May 2014 21:08:56 +0000 (14:08 -0700)]
i965: Delete current_insn() function.

10 years agoi965: Remove blorp unit tests.
Matt Turner [Wed, 14 May 2014 22:15:02 +0000 (15:15 -0700)]
i965: Remove blorp unit tests.

They've served their purpose (in transitioning blorp to using
fs_generator) and now they just necessitate large amounts of manual
labor to regenerate if the disassembler changes.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agoegl-static: include libradeonwinsys.la only once
Emil Velikov [Sat, 10 May 2014 14:59:03 +0000 (15:59 +0100)]
egl-static: include libradeonwinsys.la only once

With this and the previous patch, we no longer have multiple
definitions in the final egl_gallium.so.

v2: Drop duplicate libloader link.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Chia-I Wu <olv@lunarg.com> (v1)
Reviewed-by: Tom Stellard <thomas.stellard@amd.com> (v1)
10 years agogallium/radeon: link in libradeon.la at target level
Emil Velikov [Sat, 10 May 2014 13:35:08 +0000 (14:35 +0100)]
gallium/radeon: link in libradeon.la at target level

It makes more sense to link the core and common parts of the driver as the
target is build. Additionally this will help us drop duplicating symbols
for targets that static link mulitple pipe-drivers. Only egl-static needs
that currently with more to come.

To simplify things a bit add HAVE_GALLIUM_RADEON_COMMON variable.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
10 years agogallium/radeon: build only a single common library libradeon
Emil Velikov [Sat, 10 May 2014 13:25:08 +0000 (14:25 +0100)]
gallium/radeon: build only a single common library libradeon

Just fold libllvmradeon in libradeon.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
10 years agofreedreno/a3xx: fix write to bogus register
Rob Clark [Wed, 14 May 2014 16:46:42 +0000 (12:46 -0400)]
freedreno/a3xx: fix write to bogus register

The loops for updating the multiple packed fields in SP_VS_OUT[] and
SP_VS_VPC_DST[] will zero out one register beyond the last that on
required.  Which is normally not a problem (and is kinda convenient
when looking at cmdstream dumps) unless we have maximum (16) varyings.

Fix loop termination condition so that this does not happen.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years agofreedreno/a3xx: account for special inputs/outputs
Rob Clark [Wed, 14 May 2014 15:39:44 +0000 (11:39 -0400)]
freedreno/a3xx: account for special inputs/outputs

We need to size input/output tables big enough for special inputs/
outputs (gl_Position, gl_FrontFacing, etc) which, while they don't
count towards the hw limit of 16 attributes or 16 varyings, we do
still need to track them all the same.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years agofreedreno/a3xx: fix MAX_INPUTS shader cap
Rob Clark [Wed, 14 May 2014 15:15:26 +0000 (11:15 -0400)]
freedreno/a3xx: fix MAX_INPUTS shader cap

Hardware only supports 16.  Which fd3_shader_variant properly reflected,
but the pipe cap did not, leading to array overflow (and shaders that
could not possibly work).

Also a bunch of asserts to make problems like this easier to see.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years agofreedreno/a3xx: add debug flag to expose glsl130
Rob Clark [Wed, 14 May 2014 15:06:21 +0000 (11:06 -0400)]
freedreno/a3xx: add debug flag to expose glsl130

We are starting to add integer support to the compiler, which does not
get exercised with glsl feature level 120 and without advertising
integer support.  But doing so breaks too many things right now.  So
for now use a debug flag to conditionally expose the functionality
while it is in development.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years agofreedreno/a3xx/compiler: add KILL_IF
Ryan Houdek [Wed, 14 May 2014 02:58:03 +0000 (21:58 -0500)]
freedreno/a3xx/compiler: add KILL_IF

The KILL_IF opcode could potentially be merged in to the regular KILL
opcode function.  It was a pain to do so, so I've left is separated
for cleanliness.

Signed-off-by: Ryan Houdek <Sonicadvance1@gmail.com>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years agofreedreno/a3xx/compiler: start adding integer support
Ryan Houdek [Wed, 14 May 2014 02:44:41 +0000 (21:44 -0500)]
freedreno/a3xx/compiler: start adding integer support

Adds a large sum of TGSI opcodes to the a3xx compiler.

For integer opcodes we have 28 opcodes added.
Adds 4 floating point compare opcodes

If GLSL 1.30 is enabled, this allows the GLSL 1.30 piglits to have a
completion amount of 432/641.

Signed-off-by: Ryan Houdek <Sonicadvance1@gmail.com>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years agodraw: better llvm names for shaders for debugging.
Roland Scheidegger [Tue, 13 May 2014 01:43:11 +0000 (03:43 +0200)]
draw: better llvm names for shaders for debugging.

All shaders had the same name.
We could probably use some identifier per shader too, but for now only use
the variant number.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
10 years agollvmpipe: improve setup shader names (for debugging)
Roland Scheidegger [Tue, 13 May 2014 00:59:30 +0000 (02:59 +0200)]
llvmpipe: improve setup shader names (for debugging)

The setup shaders were composed of both a fs shader number and a variant
number. But since they aren't tied to a particular fragment shader, the
former was a fixed zero while the latter was also always zero because
it was never assigned. So, similar to what the fs code does, use a ever
increasing number to give it a more catchy name (unlike fragment shaders
though where this number is for each explicitly created shader, we just use
it for the implicitly created variants).
And while here, fix whitespace a bit.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
10 years agollvmpipe: kill off llvmpipe_variant_count
Roland Scheidegger [Tue, 13 May 2014 00:20:32 +0000 (02:20 +0200)]
llvmpipe: kill off llvmpipe_variant_count

Unused except it was increased for both fs and setup shader variants created.
Probably some leftover from ages ago.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
10 years agomesa/st: fix number of ubos being declared in a shader
Roland Scheidegger [Wed, 14 May 2014 19:06:23 +0000 (21:06 +0200)]
mesa/st: fix number of ubos being declared in a shader

Previously the code used the total number of ubos being declared in the
linked program (so the ubos of all shaders combined), use the number
from the particular shader instead.
This fixes an assertion failure with piglit arb_uniform_buffer_object-maxblocks
seen in llvmpipe since 8a9f5ecdb116d0449d63f7b94efbfa8b205d826f as it now emits
code for each declared buffer, not just the ones actually used.

CC: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agonvc0: enable support for maxwell boards
Ben Skeggs [Fri, 9 May 2014 05:56:08 +0000 (15:56 +1000)]
nvc0: enable support for maxwell boards

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
10 years agonvc0: add maxwell (sm50) compiler backend
Ben Skeggs [Fri, 9 May 2014 05:56:05 +0000 (15:56 +1000)]
nvc0: add maxwell (sm50) compiler backend

The big missing part here is proper sched data calculations, but
hopefully the chosen placeholder will be sufficient for now.

Passes piglit as well as GK107 does.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
10 years agonvc0: maxwell isa has no per-instruction join modifier
Ben Skeggs [Fri, 9 May 2014 05:56:03 +0000 (15:56 +1000)]
nvc0: maxwell isa has no per-instruction join modifier

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
10 years agonvc0: replace immd 0 with $rLASTGPR for emit/restart opcodes
Ben Skeggs [Fri, 9 May 2014 05:56:01 +0000 (15:56 +1000)]
nvc0: replace immd 0 with $rLASTGPR for emit/restart opcodes

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
10 years agonvc0: move nvc0 lowering pass class definitions into header
Ben Skeggs [Fri, 9 May 2014 05:55:59 +0000 (15:55 +1000)]
nvc0: move nvc0 lowering pass class definitions into header

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
10 years agonvc0: bump sched data member to 32-bits
Ben Skeggs [Fri, 9 May 2014 05:55:57 +0000 (15:55 +1000)]
nvc0: bump sched data member to 32-bits

SM50 backend requires 21 bits per instruction, not 8.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
10 years agonvc0: use vertex arrays for eng3d blit
Ben Skeggs [Fri, 9 May 2014 05:55:55 +0000 (15:55 +1000)]
nvc0: use vertex arrays for eng3d blit

Maxwell doesn't have immediate-mode.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
10 years agonvc0: restrict "constant vbo" logic to fermi/kepler classes
Ben Skeggs [Fri, 9 May 2014 05:55:53 +0000 (15:55 +1000)]
nvc0: restrict "constant vbo" logic to fermi/kepler classes

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
10 years agonvc0: replace some vb->stride checks with constant_vbo instead
Ben Skeggs [Fri, 9 May 2014 05:55:51 +0000 (15:55 +1000)]
nvc0: replace some vb->stride checks with constant_vbo instead

Maxwell no longer has the methods to set constant attributes, and we'll
want to be treating stride 0 vtxbufs the same as for stride > 0.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
10 years agonvc0: add maxwell class
Ben Skeggs [Fri, 9 May 2014 05:55:49 +0000 (15:55 +1000)]
nvc0: add maxwell class

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
10 years agonvc0: allow for easier modification of compiler library routines
Ben Skeggs [Fri, 9 May 2014 05:55:47 +0000 (15:55 +1000)]
nvc0: allow for easier modification of compiler library routines

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
10 years agonvc0: properly distribute macros in source form
Ben Skeggs [Fri, 9 May 2014 05:55:44 +0000 (15:55 +1000)]
nvc0: properly distribute macros in source form

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
10 years agodocs: Add a note about llvm-shared-libs and libxatracker
Emil Velikov [Mon, 5 May 2014 21:09:23 +0000 (22:09 +0100)]
docs: Add a note about llvm-shared-libs and libxatracker

Both changes landed in 10.2, and for people not following the
development cycle these will come as a surprise. Note that the
pipe_* interface is not stable.

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Rob Clark <robclark@freedesktop.org>
10 years agoautomake: Honor GL_LIB for gallium libgl-xlib
Brad King [Tue, 6 May 2014 15:06:47 +0000 (11:06 -0400)]
automake: Honor GL_LIB for gallium libgl-xlib

Use "@GL_LIB@" in src/gallium/targets/libgl-xlib/Makefile.am to produce
the library name specified by the configure --with-gl-lib-name option.

Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
10 years agoconfigure: correctly set LD_NO_UNDEFINED
Emil Velikov [Tue, 13 May 2014 00:33:48 +0000 (01:33 +0100)]
configure: correctly set LD_NO_UNDEFINED

Commit 11623be934f85 was meant to have this hunk, which
I accidently dropped during git rebase.

Cc: 10.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Julien Cristau <jcristau@debian.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jonathan Gray <jsg@jsg.id.au>
10 years agogallivm: only fetch pointers to constant buffers once
Roland Scheidegger [Wed, 14 May 2014 13:43:53 +0000 (15:43 +0200)]
gallivm: only fetch pointers to constant buffers once

In 1d35f77228ad540a551a8e09e062b764a6e31f5e support for multiple constant
buffers was introduced. This meant we had another indirection, and we did
resolve the indirection for each constant buffer access. This looks very
reasonable since llvm can figure out if it's the same pointer, however it
turns out that this can cause llvm compilation time to go through the roof
and beyond (I've seen cases in excess of factor 100, e.g. from 50 ms to more
than 10 seconds (!)), with all the additional time spent in IR optimization
passes (and in the end all of it in DominatorTree::dominate()).
I've been unable to narrow it down a bit more (only some shaders seem affected,
seemingly without much correlation to overall shader complexity or constant
usage) but it is easily avoidable by doing the buffer lookups themeselves just
once (at constant buffer declaration time).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
10 years agogallivm: fix output stream flushing in error case for disassembly.
Roland Scheidegger [Wed, 14 May 2014 01:23:09 +0000 (03:23 +0200)]
gallivm: fix output stream flushing in error case for disassembly.

When there's an error, also need to flush the stream, otherwise an assertion
is hit (meaning you don't actually see the error neither).

10 years agoradeonsi: Fix anisotropic filtering state setup
Michel Dänzer [Wed, 14 May 2014 07:30:33 +0000 (16:30 +0900)]
radeonsi: Fix anisotropic filtering state setup

Bring it back in line with r600g. I broke this in the original radeonsi
bringup. :(

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78537

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
10 years agotgsi: support parsing texture offsets from text tgsi shaders
Ilia Mirkin [Thu, 8 May 2014 01:15:12 +0000 (21:15 -0400)]
tgsi: support parsing texture offsets from text tgsi shaders

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
10 years agomesa/st: provide native integers implementation of ir_unop_any
Ilia Mirkin [Thu, 8 May 2014 13:06:36 +0000 (09:06 -0400)]
mesa/st: provide native integers implementation of ir_unop_any

Previously, ir_unop_any was implemented via a dot-product call, which
uses floating point multiplication and addition. The multiplication was
completely pointless, and the addition can just as well be done with an
or. Since we know that the inputs are booleans, they must already be in
canonical 0/~0 format, and the final SNE can also be avoided.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
10 years agogallium/docs: clarify when query results are reset
Rob Clark [Tue, 13 May 2014 02:19:03 +0000 (22:19 -0400)]
gallium/docs: clarify when query results are reset

It wasn't completely clear from the docs, so I had to figure out by
looking at piglit results.  Hopefully this saves the next driver writer
implementing queries some time.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years agogallivm: Remove lp_func_delete_body.
José Fonseca [Mon, 12 May 2014 15:12:32 +0000 (16:12 +0100)]
gallivm: Remove lp_func_delete_body.

Not necessary, now that we will free the whole module (hence all
function bodies) immediately after compiling.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
10 years agogallivm: Remove gallivm_free_function.
José Fonseca [Mon, 12 May 2014 14:59:55 +0000 (15:59 +0100)]
gallivm: Remove gallivm_free_function.

Unused.  Deprecated by gallivm_free_ir().

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
10 years agollvmpipe: Delete unneeded LLVM stuff earlier.
José Fonseca [Mon, 12 May 2014 14:59:13 +0000 (15:59 +0100)]
llvmpipe: Delete unneeded LLVM stuff earlier.

Same as Frank's change to draw module but for llvmpipe module.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
10 years agodraw: Delete unneeded LLVM stuff earlier.
Frank Henigman [Tue, 1 Oct 2013 19:15:43 +0000 (15:15 -0400)]
draw: Delete unneeded LLVM stuff earlier.

Free up unneeded LLVM stuff immediately after generating vertex shader
code.  Saves about 500K per shader.

v2: Don't bother calling gallivm_free_function (Jose)

Signed-off-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
10 years agogallivm: Separate freeing LLVM intermediate data from freeing final code.
Frank Henigman [Tue, 1 Oct 2013 19:15:42 +0000 (15:15 -0400)]
gallivm: Separate freeing LLVM intermediate data from freeing final code.

Split free_gallivm_state() into two steps.  First step is
gallivm_free_ir() which cleans up the LLVM scaffolding used to generate
code while preserving the code itself.  Second step is
gallivm_free_code() to free the memory occupied by the code.

v2: s/gallivm_teardown/gallivm_free_ir/ (Jose)

Signed-off-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
10 years agogallivm: One code memory pool with deferred free.
Frank Henigman [Tue, 1 Oct 2013 19:15:41 +0000 (15:15 -0400)]
gallivm: One code memory pool with deferred free.

Provide a JITMemoryManager derivative which puts all generated code into
one memory pool instead of creating a new one each time code is generated.
This saves significant memory per shader as the pool size is 512K and
a small shader occupies just several K.

This memory manager also defers freeing generated code until you tell
it to do so, making it possible to destroy the LLVM engine while keeping
the code, thus enabling future memory savings.

v2: Fix compilation errors with LLVM 3.4 (Jose)

Signed-off-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
10 years agogallivm: Run passes per module, not per function.
José Fonseca [Mon, 12 May 2014 13:29:04 +0000 (14:29 +0100)]
gallivm: Run passes per module, not per function.

This is how it is meant to be done nowadays.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
10 years agogallivm: Use LLVM global context.
José Fonseca [Thu, 8 May 2014 14:21:49 +0000 (15:21 +0100)]
gallivm: Use LLVM global context.

I saw that LLVM internally uses its global context for some things, even
when we use our own.  Given ours is also global, might as well use
LLVM's.

However, sepearate contexts can still be enabled with a simple source
code modification, for when the need/benefit arises.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
10 years agogallivm: Stop using module providers.
José Fonseca [Mon, 12 May 2014 13:03:47 +0000 (14:03 +0100)]
gallivm: Stop using module providers.

Nowadays LLVMModuleProviderRef is just an alias for LLVMModuleRef, so
its use just causes unnecessary confusion.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
10 years agogallivm,draw,llvmpipe: Remove support for versions of LLVM prior to 3.1.
José Fonseca [Thu, 8 May 2014 12:25:28 +0000 (13:25 +0100)]
gallivm,draw,llvmpipe: Remove support for versions of LLVM prior to 3.1.

Older versions haven't been tested probably don't work anyway.  But more
importantly, code supporting it is hindering further work.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
10 years agoconfigure: Require LLVM 3.1.
José Fonseca [Mon, 12 May 2014 15:30:51 +0000 (16:30 +0100)]
configure: Require LLVM 3.1.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
10 years agoscons: Require LLVM 3.1
José Fonseca [Thu, 8 May 2014 12:32:07 +0000 (13:32 +0100)]
scons: Require LLVM 3.1

Support for prior versions will be removed in the following change.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
10 years agoi965: Reformat brw_set_src1 so it can be easily found with grep.
Matt Turner [Fri, 2 May 2014 21:49:24 +0000 (14:49 -0700)]
i965: Reformat brw_set_src1 so it can be easily found with grep.

10 years agoi965: fix size assert for gen7 in brw_init_compaction_tables()
Samuel Iglesias Gonsalvez [Thu, 8 May 2014 13:55:08 +0000 (15:55 +0200)]
i965: fix size assert for gen7 in brw_init_compaction_tables()

It should compare with it's own size.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
10 years agoi965: Relax accumulator dependency scheduling on Gen < 6
Iago Toral Quiroga [Wed, 7 May 2014 07:58:43 +0000 (09:58 +0200)]
i965: Relax accumulator dependency scheduling on Gen < 6

Many instructions implicitly update the accumulator on Gen < 6. The instruction
scheduling code just calls add_barrier_deps() for each accumulator access on
these platforms, but a large class of operations don't actually update the
accumulator -- mostly move and logical instructions. Teaching the scheduling
code about this would allow more flexibility to schedule instructions.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77740
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agoglsl: simplify the M_PI*f macros, fixes build on OpenBSD
Jonathan Gray [Wed, 14 May 2014 05:13:55 +0000 (15:13 +1000)]
glsl: simplify the M_PI*f macros, fixes build on OpenBSD

The M_PI*f macros used a preprocessor paste to append 'f'
to M_PI defines, which works if the values are only numbers
but breaks on OpenBSD where M_PI definitions have casts
and brackets to meet requirements of a future version of POSIX,

http://austingroupbugs.net/view.php?id=801
http://austingroupbugs.net/view.php?id=828

Simplify the M_PI*f macros by using casts directly in the defines
as suggested by Kenneth Graunke.

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78665
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
10 years agodocs: Really add the 10.1.3 release nots this time
Carl Worth [Wed, 14 May 2014 00:30:17 +0000 (17:30 -0700)]
docs: Really add the 10.1.3 release nots this time

Commit a96c3bccf6791359d1159ebe9475e0ed5cf790ed intended to add these, but I
forgot to add the file.

10 years agofreedreno/a3xx: occlusion query support
Rob Clark [Sun, 11 May 2014 18:15:32 +0000 (14:15 -0400)]
freedreno/a3xx: occlusion query support

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years agofreedreno: add support for hw queries
Rob Clark [Sat, 10 May 2014 17:45:54 +0000 (13:45 -0400)]
freedreno: add support for hw queries

Real GPU queries need some infrastructure to track samples per tile and
accumulate the results.  But fortunately this can be shared across GPU
generation.

See:
https://github.com/freedreno/freedreno/wiki/Queries#hardware-queries

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years agofreedreno/query: allow multiple query implementations
Rob Clark [Fri, 9 May 2014 21:33:19 +0000 (17:33 -0400)]
freedreno/query: allow multiple query implementations

Split out fd_query into an abstract base class, to allow multiple
implementations.  The current sw based queries are moved into
fd_sw_query.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years agomesa: Dump ARB_vp/fp source and IR when MESA_GLSL=dump.
Kenneth Graunke [Mon, 12 May 2014 03:22:48 +0000 (20:22 -0700)]
mesa: Dump ARB_vp/fp source and IR when MESA_GLSL=dump.

As far as I can tell, Mesa hasn't had a convenient way to dump ARB_vp/fp
source until now.  Using MESA_GLSL=dump is convenient, since it means
you can use a single environment variable to dump a program's shaders,
no matter which language they're written in.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agoi965: Don't _swrast_BlitFramebuffer when doing CopyTexSubImage.
Kenneth Graunke [Mon, 12 May 2014 00:20:08 +0000 (17:20 -0700)]
i965: Don't _swrast_BlitFramebuffer when doing CopyTexSubImage.

The point of copytexsubimage_using_blit_framebuffer is to use a hardware
accelerated BlitFramebuffer path.  If that fails, we shouldn't do a
swrast blit---we should try our CTSI fallback code.

This is especially important for i965 and GLES, where we don't even
create a swrast context.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77705
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
10 years agoi965/gen8: Set depth extent field
Jordan Justen [Tue, 13 May 2014 18:06:59 +0000 (18:06 +0000)]
i965/gen8: Set depth extent field

The depth extent field is used to limit the allowed slice range that
can be rendered to.

With the previous setting, only slice 0 could be rendered.

This fixes piglit amd_vertex_shader_layer-layered-depth-texture-render.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years agoi965/gen8 depth: Set depth size based on LOD0 for 3D textures
Jordan Justen [Sat, 10 May 2014 21:48:47 +0000 (14:48 -0700)]
i965/gen8 depth: Set depth size based on LOD0 for 3D textures

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years agoi965/gen7 depth: Set depth size based on LOD0 for 3D textures
Jordan Justen [Sat, 10 May 2014 21:48:47 +0000 (14:48 -0700)]
i965/gen7 depth: Set depth size based on LOD0 for 3D textures

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years agoi965/gen8 renderbuffer: Set depth size based on LOD0 for 3D textures
Jordan Justen [Sat, 10 May 2014 21:48:47 +0000 (14:48 -0700)]
i965/gen8 renderbuffer: Set depth size based on LOD0 for 3D textures

Fixes piglit's
'gl-3.2-layered-rendering-clear-color-all-types 3d mipmapped'

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years agoi965/gen7 renderbuffer: Set depth size based on LOD0 for 3D textures
Jordan Justen [Sat, 10 May 2014 21:48:47 +0000 (14:48 -0700)]
i965/gen7 renderbuffer: Set depth size based on LOD0 for 3D textures

If blorp is disabled for color clears, then piglit's
'gl-3.2-layered-rendering-clear-color-all-types 3d mipmapped'
will fail.

Currently, gen8 fails similarly on this test because gen8
does not use blorp.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years agofreedreno/a3xx: add point-size
Rob Clark [Sun, 11 May 2014 15:57:20 +0000 (11:57 -0400)]
freedreno/a3xx: add point-size

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years agofreedreno: update generated headers
Rob Clark [Sun, 11 May 2014 15:51:41 +0000 (11:51 -0400)]
freedreno: update generated headers

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years agoglsl_to_tgsi: remove unnecessary dead code elimination pass
Bryan Cain [Tue, 6 May 2014 03:23:38 +0000 (22:23 -0500)]
glsl_to_tgsi: remove unnecessary dead code elimination pass

With the more advanced dead code elimination pass already being run,
eliminate_dead_code was making no difference in instruction count, and had
an undesirable O(n^2) runtime. So remove it and rename
eliminate_dead_code_advanced to eliminate_dead_code.

Reviewed-by: Marek Olšák <marek.olsak at amd.com>
10 years agoralloc: Omit detailed license information about talloc.
José Fonseca [Fri, 9 May 2014 09:55:17 +0000 (10:55 +0100)]
ralloc: Omit detailed license information about talloc.

That information misleads source code auditing tools to think that
ralloc itself is released under LGPL v3.

Instead, simply state talloc is not licensed under a permissive license.

v2: Use wording suggested by Kenneth.

Reviewed-by: Brian Paul <brianp@vmware.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agoi965: Avoid redundant call to brw_merge_inputs() in brw_try_draw_prims()
Iago Toral Quiroga [Thu, 8 May 2014 11:29:20 +0000 (13:29 +0200)]
i965: Avoid redundant call to brw_merge_inputs() in brw_try_draw_prims()

We always call brw_merge_inputs() right before looping over the primitives but
this can be called inside the loop for each primitive too. In the case we do it
for the first primitive the call is redundant and can be skipped.

Reviewed-by: Eric Anholt <eric@anholt.net>
10 years agoglsl: Do not call lhs->variable_referenced() multiple times
Iago Toral Quiroga [Fri, 9 May 2014 06:50:03 +0000 (08:50 +0200)]
glsl: Do not call lhs->variable_referenced() multiple times

Instead take the result from the first call and use it where needed.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agometa: Refactor state save/restore for framebuffer texture blits
Topi Pohjolainen [Wed, 7 May 2014 05:42:29 +0000 (08:42 +0300)]
meta: Refactor state save/restore for framebuffer texture blits

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agowayland: Move version 2 request to end of interface specification
Kristian Høgsberg [Mon, 12 May 2014 22:40:11 +0000 (15:40 -0700)]
wayland: Move version 2 request to end of interface specification

We're moving towards requiring interface additions to be appended to the
end of the interface block.  No functional change, opcodes are assigned as
before, but version 2 additions are now grouped together, which prevents
a scanner warning.

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
10 years agoglsl: the number of samplers is already calculated so use it
Timothy Arceri [Sun, 11 May 2014 12:11:21 +0000 (22:11 +1000)]
glsl: the number of samplers is already calculated so use it

Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agoi965: Stop doing remapping of "special" regs.
Eric Anholt [Tue, 6 May 2014 22:19:55 +0000 (15:19 -0700)]
i965: Stop doing remapping of "special" regs.

Now that we aren't using pixel_[xy] in live variables, nothing is looking
at these regs after the visitor stage.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agoi965: Generalize the pixel_x/y workaround for all UW types.
Eric Anholt [Tue, 6 May 2014 20:22:10 +0000 (13:22 -0700)]
i965: Generalize the pixel_x/y workaround for all UW types.

This is the only case where a fs_reg in brw_fs_visitor is used during
optimization/code generation, and it meant that optimizations had to be
careful to not move pixel_x/y's register number without updating it.

Additionally, it turns out we had a couple of other UW values that weren't
getting this treatment (like gl_SampleID), so this more general fix is
probably a good idea (though I wasn't able to replicate problems with
either pixel_[xy]'s values or gl_SampleID, even when telling the register
allocator to reuse registers immediately)

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agoi965: Move has_hiz from the slice to the level.
Eric Anholt [Wed, 23 Apr 2014 21:21:21 +0000 (14:21 -0700)]
i965: Move has_hiz from the slice to the level.

The value depends only on the level, so no need to store the bool per slice.
Shrinks intel_mipmap_slice from 24 bytes to 16, while slotting into an
existing hole in intel_mipmap_level.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
10 years agometa: Refactor configuration of renderbuffer sampling
Topi Pohjolainen [Mon, 5 May 2014 19:49:47 +0000 (22:49 +0300)]
meta: Refactor configuration of renderbuffer sampling

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agometa: Refactor binding of renderbuffer as texture image
Topi Pohjolainen [Mon, 5 May 2014 14:37:40 +0000 (17:37 +0300)]
meta: Refactor binding of renderbuffer as texture image

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agometa: Merge compiling and linking of blit program
Topi Pohjolainen [Thu, 17 Apr 2014 23:21:13 +0000 (02:21 +0300)]
meta: Merge compiling and linking of blit program

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agoi965/blorp: Expose coordinate scissoring and mirroring
Topi Pohjolainen [Tue, 22 Apr 2014 19:11:27 +0000 (22:11 +0300)]
i965/blorp: Expose coordinate scissoring and mirroring

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agoi965/gen8: Use helper variables for surface parameters
Topi Pohjolainen [Wed, 7 May 2014 09:12:02 +0000 (12:12 +0300)]
i965/gen8: Use helper variables for surface parameters

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agonv50,nvc0: fix blit 3d path for 1d array textures
Ilia Mirkin [Sat, 10 May 2014 21:51:21 +0000 (17:51 -0400)]
nv50,nvc0: fix blit 3d path for 1d array textures

Need to adjust coordinates since the shader receives the array index as
depth in z, but the TEX instruction expects it to be the second
coordinate for a 1D array texture. This fixes fbo-generatemipmap-array.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
10 years agonv50,nvc0: leave queries on during blit, turn them on for 2d engine
Ilia Mirkin [Sat, 3 May 2014 07:00:07 +0000 (03:00 -0400)]
nv50,nvc0: leave queries on during blit, turn them on for 2d engine

Fixes the new logic of the conditional rendering piglit test.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
10 years agomesa/st: leave current query enabled during glBlitFramebuffer
Ilia Mirkin [Sat, 10 May 2014 14:25:29 +0000 (10:25 -0400)]
mesa/st: leave current query enabled during glBlitFramebuffer

Also make sure that pipe_blit_info gets zero'd out so that query isn't
accidentally left enabled.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
10 years agogallium: add bit to pipe_blit_info to leave current query enabled
Ilia Mirkin [Sat, 10 May 2014 14:22:17 +0000 (10:22 -0400)]
gallium: add bit to pipe_blit_info to leave current query enabled

Previously the implication was that queries should be disabled during
blits. However glBlitFramebuffer() is supposed to obey the current
query, and this new bit will indicate that to the driver.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
10 years agonv50: fix setting of texture ms info to be per-stage
Ilia Mirkin [Sat, 10 May 2014 03:13:38 +0000 (23:13 -0400)]
nv50: fix setting of texture ms info to be per-stage

Different textures may be bound to each slot for each stage. So we need
to be able to upload ms parameters for each one without stages
overwriting each other.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
10 years agonv50/ir: make sure to reverse cond codes on all the OP_SET variants
Ilia Mirkin [Sat, 10 May 2014 19:02:36 +0000 (15:02 -0400)]
nv50/ir: make sure to reverse cond codes on all the OP_SET variants

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Cc: "10.2 10.1" <mesa-stable@lists.freedesktop.org>
10 years agofreedreno/a2xx: fix compiler warning
Rob Clark [Sun, 11 May 2014 12:57:38 +0000 (08:57 -0400)]
freedreno/a2xx: fix compiler warning

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years agoradeonsi: prepare depth export registers at compile time
Marek Olšák [Tue, 6 May 2014 17:59:53 +0000 (19:59 +0200)]
radeonsi: prepare depth export registers at compile time

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>