mesa.git
10 years ago Fix build of appleglx
Jon TURNEY [Mon, 12 May 2014 09:47:07 +0000 (10:47 +0100)]
Fix build of appleglx

Define GLX_USE_APPLEGL, as config/darwin used to, to turn on specific code to
use the applegl direct renderer

Convert src/glx/apple/Makefile to automake

Since the applegl libGL is now built by linking libappleglx into libGL, rather
than by linking selected files into a special libGL:

- Remove duplicate code in apple/glxreply.c and apple/apple_glx.c.  This makes
apple/glxreply.c empty, so remove it

- Some indirect rendering code is already guarded by !GLX_USE_APPLEGL, but we
need to add those guards to indirect_glx.c, indirect_init.c (via its
generator), render2.c and vertarr.c so they don't generate anything

Fix and update various includes

glapi_gentable.c (which is only used on darwin) should be included in shared
glapi as well, to provide _glapi_create_table_from_handle()

Note that neither swrast nor indirect is supported in the APPLEGL path at the
moment, which makes things more complex than they need to be.  More untangling
is needed to allow that

v2: Correct apple/Makefile.am for srcdir != builddir

Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Reviewed-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
10 years ago Make DRI dependencies and build depend on the target
Jon TURNEY [Mon, 12 May 2014 09:17:06 +0000 (10:17 +0100)]
Make DRI dependencies and build depend on the target

- Don't require xcb-dri[23] etc. if we aren't building for a target with DRM, as
we won't be using dri[23]

- Enable a more fine-grained control of what DRI code is built, so that a libGL
using direct swrast can be built on targets which don't have DRM.

The HAVE_DRI automake conditional is retired in favour of a number of other
conditionals:

HAVE_DRI2 enables building of code using the DRI2 interface (and possibly DRI3
with HAVE_DRI3)

HAVE_DRISW enables building of DRI swrast

HAVE_DRICOMMON enables building of target-independent DRI code, and also enables
some makefile cases where a more detailed decision is made at a lower level.

HAVE_APPLEDRI enables building of an Apple-specific direct rendering interface,
which still requires additional fixing up to build properly.

v2:
Place xfont.c and drisw_glx.c into correct categories.
Update 'make check' as well

Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Reviewed-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
10 years ago Fix build for darwin
Jon TURNEY [Sun, 11 May 2014 13:38:52 +0000 (14:38 +0100)]
Fix build for darwin

Fix build for darwin, when configured with --disable-driglx-direct

- darwin ld doesn't support -Bsymbolic or --version-script, so check if ld
supports those options before using them
- define GLX_ALIAS_UNSUPPORTED as config/darwin used to, as aliasing of non-weak
symbols isn't supported
- default to --with-dri-drivers=swrast

v2:
Use -Wl,-Bsymbolic, as before, not -Bsymbolic
Test that ld --version-script works, rather than just looking for it in ld --help
Don't use -Wl,--no-undefined on darwin, either

Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Reviewed-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
10 years ago targets/egl-static: add missing line break in ldflags
Emil Velikov [Sun, 18 May 2014 07:07:24 +0000 (08:07 +0100)]
targets/egl-static: add missing line break in ldflags

Accidentally omitted by commit 7b7944ee1cedeaf.

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Jon TURNEY <jon.turney@dronecode.org.uk>
10 years ago mesa: Fix unbinding GL_DEPTH_STENCIL_ATTACHMENT
James Legg [Fri, 23 May 2014 11:25:37 +0000 (12:25 +0100)]
mesa: Fix unbinding GL_DEPTH_STENCIL_ATTACHMENT

glFramebufferRender(..., GL_DEPTH_STENCIL_ATTACHMENT, ..., 0) only
detached the depth buffer and not the stencil buffer.

Bugzilla: http://bugs.freedesktop.org/show_bug.cgi?id=79115
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
10 years ago targets/osmesa: limit the amount of exported symbols
Emil Velikov [Wed, 21 May 2014 00:07:00 +0000 (18:07 -0600)]
targets/osmesa: limit the amount of exported symbols

src/gallium/targets/osmesa/Makefile.am |  1 +
src/gallium/targets/osmesa/osmesa.sym  | 18 ++++++++++++++++++
2 files changed, 19 insertions(+)
create mode 100644 src/gallium/targets/osmesa/osmesa.sym

10 years ago gallivm: Disable workaround for PR12833 on LLVM 3.2+.
José Fonseca [Wed, 14 May 2014 11:55:50 +0000 (12:55 +0100)]
gallivm: Disable workaround for PR12833 on LLVM 3.2+.

Fixed upstream.

10 years ago gallivm: Support MCJIT on Windows.
José Fonseca [Wed, 14 May 2014 11:20:14 +0000 (12:20 +0100)]
gallivm: Support MCJIT on Windows.

It works fine, though it requires using ELF objects.

With this change there is nothing preventing us from switching exclusively
to MCJIT everywhere.  It's still off though.

10 years ago mesa/x86: Fix build with clang 3.4.
José Fonseca [Fri, 23 May 2014 10:36:58 +0000 (11:36 +0100)]
mesa/x86: Fix build with clang 3.4.

It defines bit_SSE41 instead of bit_SSE4_1.

Fixes https://bugs.freedesktop.org/show_bug.cgi?id=79095

Trivial.
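
A minimal C sketch of how such a naming difference can be papered over,
assuming the only difference is the macro name. The helper sse41_mask() and
the literal fallback are illustrative assumptions, not the code from this
commit; the fallback value is CPUID leaf 1, ECX bit 19.

```c
#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>
/* Some compilers spell the macro bit_SSE41; alias it to the other name. */
#if !defined(bit_SSE4_1) && defined(bit_SSE41)
#define bit_SSE4_1 bit_SSE41
#endif
#endif

/* Fallback for headers (or targets) that provide neither spelling. */
#ifndef bit_SSE4_1
#define bit_SSE4_1 (1 << 19)
#endif

/* Hypothetical helper: the ECX mask to test for SSE4.1. */
static unsigned sse41_mask(void)
{
   return (unsigned)bit_SSE4_1;
}
```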

10 years ago mesa: Move declaration to top of block.
José Fonseca [Fri, 23 May 2014 10:23:52 +0000 (11:23 +0100)]
mesa: Move declaration to top of block.

To fix MSVC build.  Trivial.
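
For illustration only: compilers that accept just C89-style C require all
declarations to come before the first statement of a block. The function
below is a hypothetical example of that layout, not the code touched by
this commit.

```c
static int sum_to(int n)
{
   int total = 0;   /* declarations first, before any statement */
   int i;

   for (i = 1; i <= n; i++)
      total += i;

   return total;
}
```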

10 years ago meta blit: Set Z texcoord during meta blit to sample the correct layer
Jordan Justen [Wed, 21 May 2014 22:34:26 +0000 (22:34 +0000)]
meta blit: Set Z texcoord during meta blit to sample the correct layer

If the source renderbuffer has a depth > 0, then send a Z texcoord
which is set to the source attachment Z offset.

This fixes piglit's gl-3.2-layered-rendering-gl-layer-render with the
GL_TEXTURE_2D_MULTISAMPLE_ARRAY case test on i965/gen8.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
10 years ago i965: Listen to BRW_NEW_FRAGMENT_PROGRAM for 3DSTATE_PS_BLEND.
Kenneth Graunke [Tue, 20 May 2014 21:52:40 +0000 (14:52 -0700)]
i965: Listen to BRW_NEW_FRAGMENT_PROGRAM for 3DSTATE_PS_BLEND.

brw_color_buffer_write_enabled depends on brw->fragment_program, which
means we have to listen to BRW_NEW_FRAGMENT_PROGRAM.

On most generations, this was only called from a function that already
subscribed.  However, on Broadwell, we failed to listen to the necessary
event in the atom that emits 3DSTATE_PS_BLEND.
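
The subscription idea can be sketched in plain C as follows. All names and
flag values here are hypothetical stand-ins, not the driver's real
structures: each atom carries a mask of the state changes it listens to, and
it is re-emitted only when that mask overlaps the dirty bits. An atom that
forgets a bit is silently skipped when only that state changes.

```c
#include <stddef.h>
#include <stdint.h>

#define NEW_FRAGMENT_PROGRAM (1u << 0)   /* hypothetical flag values */
#define NEW_BLEND            (1u << 1)

struct atom {
   uint32_t dirty;                 /* state changes this atom listens to */
   void (*emit)(int *emitted);     /* re-emits the hardware packet */
};

static void emit_ps_blend(int *emitted)
{
   (*emitted)++;                   /* stand-in for writing the packet */
}

static const struct atom ps_blend_atom = {
   NEW_BLEND | NEW_FRAGMENT_PROGRAM,
   emit_ps_blend,
};

/* Emit every atom whose subscription overlaps the dirty bits.
 * Returns the number of atoms emitted. */
static int upload_state(const struct atom *atoms, size_t n, uint32_t dirty)
{
   int emitted = 0;
   size_t i;

   for (i = 0; i < n; i++) {
      if (atoms[i].dirty & dirty)
         atoms[i].emit(&emitted);
   }
   return emitted;
}
```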

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
10 years ago i965: Use WE_all for FB write header setup on Broadwell.
Kenneth Graunke [Tue, 20 May 2014 21:52:39 +0000 (14:52 -0700)]
i965: Use WE_all for FB write header setup on Broadwell.

I forgot to disable writemasking on the OR and MOV which set the render
target index and "source 0 alpha present to render target" bit.

Using get_element_ud is equivalent and avoids a line-wrap.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
10 years ago mesa/x86: fix typos in SSE4.1 detection
Tobias Klausmann [Fri, 23 May 2014 01:02:16 +0000 (03:02 +0200)]
mesa/x86: fix typos in SSE4.1 detection

Commit a2fb71e23 introduced 32-bit code for SSE4.1. Fix compilation, and
make sure to check ecx for the SSE4.1 bit.

[imirkin: switch sse4.1 to look at ecx]
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
10 years ago mesa: Rely on USE_X86_64_ASM.
José Fonseca [Thu, 22 May 2014 19:43:55 +0000 (20:43 +0100)]
mesa: Rely on USE_X86_64_ASM.

This fixes MinGW x64 builds.  We don't use assembly on any of the
Windows builds, to avoid divergence between MSVC and MinGW when testing.

Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years ago scons: Fix x86_64 build.
José Fonseca [Thu, 22 May 2014 19:24:44 +0000 (20:24 +0100)]
scons: Fix x86_64 build.

x86/common_x86.c is required also for x86_64 builds.

Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years ago docs: Import 10.1.4 release notes, add news item.
Carl Worth [Tue, 20 May 2014 22:31:34 +0000 (15:31 -0700)]
docs: Import 10.1.4 release notes, add news item.

10 years ago mesa/x86: Brown bag fix for undeclared variable.
Matt Turner [Thu, 22 May 2014 18:02:18 +0000 (11:02 -0700)]
mesa/x86: Brown bag fix for undeclared variable.

10 years ago i965: Use SSE4.1 runtime detection for intel_miptree_map.
Matt Atwood [Fri, 2 May 2014 16:44:45 +0000 (09:44 -0700)]
i965: Use SSE4.1 runtime detection for intel_miptree_map.

Previously it was a compile-time decision.

Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years ago mesa/x86: add SSE4.1 runtime detection.
Matt Atwood [Fri, 2 May 2014 16:44:44 +0000 (09:44 -0700)]
mesa/x86: add SSE4.1 runtime detection.

Add a bit to _mesa_x86_features for SSE 4.1, along with macros to query.
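
A rough sketch of the feature-bit-plus-query-macro pattern, with made-up
names and a made-up bit position (the real _mesa_x86_features layout is not
shown in this log):

```c
#define X86_FEATURE_SSE4_1 (1u << 8)   /* hypothetical bit position */

static unsigned x86_features;          /* would be filled once at startup */

/* Query macro: true when the SSE4.1 bit was set during detection. */
#define cpu_has_sse4_1 ((x86_features & X86_FEATURE_SSE4_1) != 0)
```

Callers can then branch on cpu_has_sse4_1 at run time instead of relying on
a compile-time #ifdef.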

Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years ago mesa/x86: Support SSE 4.1 detection on x86-64.
Matt Turner [Fri, 2 May 2014 19:10:17 +0000 (12:10 -0700)]
mesa/x86: Support SSE 4.1 detection on x86-64.

Uses the cpuid.h header provided by gcc and clang. Other platforms are
encouraged to switch.
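
A minimal sketch of cpuid.h-based detection, assuming a gcc- or
clang-compatible compiler. detect_sse41() is a hypothetical helper, not the
function added by this commit. SSE4.1 is reported in ECX bit 19 of CPUID
leaf 1.

```c
#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>
#endif

/* Returns 1 if CPUID reports SSE4.1, 0 otherwise (including non-x86). */
static int detect_sse41(void)
{
#if defined(__i386__) || defined(__x86_64__)
   unsigned int eax, ebx, ecx, edx;

   /* __get_cpuid() returns 0 when the requested leaf is unsupported. */
   if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
      return 0;

   return (ecx >> 19) & 1;   /* leaf 1, ECX bit 19 */
#else
   return 0;
#endif
}
```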

10 years ago mesa: Add uninitialized_vars macro from the Linux kernel.
Matt Turner [Fri, 2 May 2014 19:10:16 +0000 (12:10 -0700)]
mesa: Add uninitialized_vars macro from the Linux kernel.

10 years ago configure.ac: Do not enable -Wl,--no-undefined on Mac OS X.
Vinson Lee [Thu, 22 May 2014 05:13:13 +0000 (22:13 -0700)]
configure.ac: Do not enable -Wl,--no-undefined on Mac OS X.

This patch fixes this build error on Mac OS X.

  CCLD     libglapi.la
clang: warning: argument unused during compilation: '-pthread'
clang: warning: argument unused during compilation: '-pthread'
ld: unknown option: --no-undefined
clang: error: linker command failed with exit code 1 (use -v to see invocation)

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
10 years ago haiku: Add missing u_memory.h for FREE()
Alexander von Gluck IV [Wed, 21 May 2014 00:20:58 +0000 (19:20 -0500)]
haiku: Add missing u_memory.h for FREE()

Acked-by: Brian Paul <brianp@vmware.com>
10 years ago configure.ac: Remove -fstack-protector-strong from LLVM flags.
Vinson Lee [Sat, 10 May 2014 01:21:59 +0000 (18:21 -0700)]
configure.ac: Remove -fstack-protector-strong from LLVM flags.

-fstack-protector-strong is not supported by clang.

This patch fixes this build error on Fedora 20 with clang.

  CXX      gallivm/lp_bld_debug.lo
clang: error: unknown argument: '-fstack-protector-strong'

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75010
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
10 years ago freedreno/a3xx: fix blend opcode
Rob Clark [Wed, 21 May 2014 20:51:12 +0000 (16:51 -0400)]
freedreno/a3xx: fix blend opcode

Seems the opcodes are slightly different from a2xx.  Resync headers and
move blend_func() helper into hw generation specific code.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years ago mesa: check constant before null check
Timothy Arceri [Wed, 21 May 2014 11:26:16 +0000 (21:26 +1000)]
mesa: check constant before null check

For most drivers this if statement is always going to fail so check the constant value first.

Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years ago freedreno/a3xx: fix depth/stencil gmem restore
Rob Clark [Wed, 21 May 2014 19:41:25 +0000 (15:41 -0400)]
freedreno/a3xx: fix depth/stencil gmem restore

We already multiply by bytes per pixel for this, so f3ba7611 broke
mem2gmem for depth/stencil.  Drop the now-redundant multiply by cpp.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years ago i965: Ask the VBO module to actually use VBOs.
Eric Anholt [Fri, 4 Oct 2013 01:52:10 +0000 (18:52 -0700)]
i965: Ask the VBO module to actually use VBOs.

Note that this covers the Begin/End rendering path, but not user vertex
arrays (so we can't drop copy_array_to_vbo_array() code).  Improves
performance of isosurf GLVERTEX|TRIANGLES by 16.7506% +/- 4.98934%
(n=20). No difference on openarena (n=10), which was why this was reverted
back in cbde2765804a4fc62bcf092230a01376aedbf2cd.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years ago freedreno/a3xx: fix depth/stencil GMEM positioning
Rob Clark [Tue, 20 May 2014 18:02:18 +0000 (14:02 -0400)]
freedreno/a3xx: fix depth/stencil GMEM positioning

In cases where there was no color buf bound, there were inconsistencies
in register settings related to position of depth/stencil inside GMEM.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years ago freedreno: update generated headers
Rob Clark [Tue, 20 May 2014 22:49:09 +0000 (18:49 -0400)]
freedreno: update generated headers

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years ago freedreno: use OUT_RELOCW when buffer is written
Rob Clark [Wed, 21 May 2014 13:24:20 +0000 (09:24 -0400)]
freedreno: use OUT_RELOCW when buffer is written

These aren't buffers we ever read back from CPU, so using incorrect
reloc fxn wasn't really harming anything.  But might as well be correct.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years ago rbug: add missing pipe->blit() entrypoint
Rob Clark [Wed, 21 May 2014 12:41:06 +0000 (08:41 -0400)]
rbug: add missing pipe->blit() entrypoint

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
10 years ago meta: Use gl_FragColor to output color values to all the draw buffers
Anuj Phogat [Mon, 19 May 2014 18:55:01 +0000 (11:55 -0700)]
meta: Use gl_FragColor to output color values to all the draw buffers

_mesa_meta_setup_blit_shader() currently generates a fragment shader
which, irrespective of the number of draw buffers, writes the color
to only one 'out' variable. The current shader relies on undefined
behavior and possibly works by chance.

From OpenGL 4.0  spec, page 256:
  "If a fragment shader writes to gl_FragColor, DrawBuffers specifies a
   set of draw buffers into which the single fragment color defined by
   gl_FragColor is written. If a fragment shader writes to gl_FragData,
   or a user-defined varying out variable, DrawBuffers specifies a set
   of draw buffers into which each of the multiple output colors defined
   by these variables are separately written. If a fragment shader writes
   to none of gl_FragColor, gl_FragData, nor any user defined varying out
   variables, the values of the fragment colors following shader execution
   are undefined, and may differ for each fragment color."

OpenGL 4.4 spec, page 463, added an additional line in this section:
  "If some, but not all user-defined output variables are written, the
   values of fragment colors corresponding to unwritten variables are
   similarly undefined."

V2: Write color output to gl_FragColor instead of writing to multiple
    'out' variables. This'll avoid recompiling the shader every time
    draw buffers count is updated.
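
As an illustration of the V2 approach, a fragment shader that writes
gl_FragColor can be kept as a C string like the one below. The shader text
and the names texSampler, texCoords and writes_frag_color() are assumptions
for this sketch, not the source that _mesa_meta_setup_blit_shader()
actually generates.

```c
#include <string.h>

/* One write to gl_FragColor is broadcast to every enabled draw buffer,
 * so the source does not depend on the draw buffer count. */
static const char *blit_fs_source =
   "uniform sampler2D texSampler;\n"
   "varying vec2 texCoords;\n"
   "void main()\n"
   "{\n"
   "   gl_FragColor = texture2D(texSampler, texCoords);\n"
   "}\n";

/* Hypothetical check used only to exercise the string. */
static int writes_frag_color(const char *src)
{
   return strstr(src, "gl_FragColor") != NULL;
}
```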

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years ago meta: Refactor _mesa_meta_setup_blit_shader() to avoid duplicate shader code
Anuj Phogat [Mon, 19 May 2014 18:47:46 +0000 (11:47 -0700)]
meta: Refactor _mesa_meta_setup_blit_shader() to avoid duplicate shader code

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years ago tgsi: add GS_INVOCATIONS to property names array
Ilia Mirkin [Tue, 20 May 2014 03:54:40 +0000 (23:54 -0400)]
tgsi: add GS_INVOCATIONS to property names array

In commit 4be146b1, I neglected to add the new property to the strings
array. This leads to the string '(null)' being printed instead when
converting a GS shader to text.
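
One way to catch this class of bug at build time is to tie the size of the
names array to the enum. The sketch below uses hypothetical property names,
not the real TGSI tables; the typedef fails to compile when an enum value is
added without a matching string.

```c
#include <assert.h>
#include <stddef.h>

enum property {
   PROP_INPUT_PRIM,
   PROP_OUTPUT_PRIM,
   PROP_INVOCATIONS,
   PROP_COUNT
};

static const char *const property_names[] = {
   "INPUT_PRIM",
   "OUTPUT_PRIM",
   "INVOCATIONS",
};

/* Negative array size (a compile error) if the two tables drift apart. */
typedef char property_names_in_sync[
   (sizeof(property_names) / sizeof(property_names[0]) == PROP_COUNT) ? 1 : -1];

static const char *property_name(enum property p)
{
   assert((size_t)p < PROP_COUNT);
   return property_names[p];
}
```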

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
10 years ago nv50,nvc0: fix 3d blits with mipmap levels
Ilia Mirkin [Sun, 18 May 2014 02:48:58 +0000 (22:48 -0400)]
nv50,nvc0: fix 3d blits with mipmap levels

Make sure to normalize the z coordinates as well as the x/y ones when
there are mipmaps present. Fixes 3d mipmap generation, which now uses
the blit path.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
10 years ago nv50/ir: fix constant folding for OP_MUL subop HIGH
Ilia Mirkin [Thu, 15 May 2014 03:22:32 +0000 (23:22 -0400)]
nv50/ir: fix constant folding for OP_MUL subop HIGH

These instructions can come in either through IMUL_HI/UMUL_HI TGSI
opcodes, or from OP_DIV constant folding.

Also make sure that the constant foldings which delete the original
instruction still get counted as having done something.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
10 years ago nv50/ir: fix s32 x s32 -> high s32 multiply logic
Ilia Mirkin [Thu, 15 May 2014 03:30:16 +0000 (23:30 -0400)]
nv50/ir: fix s32 x s32 -> high s32 multiply logic

Retrieving the high 32 bits of a signed multiply is rather annoying. It
appears that the simplest way to do this is to compute the absolute
value of the arguments, and perform a u32 x u32 -> u64 operation. If the
arguments' signs differ, then negate the result. Since there is no u64
support in the cvt instruction, we have to perform the 2's complement
negation "by hand".

This logic can come into use by the IMUL_HI instruction (very unlikely
to be seen), as well as from constant folding of division by a constant.
Fixes dolphin's divisions by 255.
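
The same sequence can be written out in C. This is a sketch of the
arithmetic described above, not the nv50 IR that the commit emits, and
mul_hi_s32() is a hypothetical name: take absolute values, do an unsigned
32 x 32 -> 64 multiply, and if the signs differ negate the 64-bit result
"by hand" as two 32-bit halves (invert both, add one, carry into the high
half).

```c
#include <stdint.h>

static int32_t mul_hi_s32(int32_t a, int32_t b)
{
   /* Absolute values, computed in unsigned arithmetic so INT32_MIN works. */
   uint32_t ua = a < 0 ? 0u - (uint32_t)a : (uint32_t)a;
   uint32_t ub = b < 0 ? 0u - (uint32_t)b : (uint32_t)b;
   uint64_t prod = (uint64_t)ua * ub;
   uint32_t lo = (uint32_t)prod;
   uint32_t hi = (uint32_t)(prod >> 32);

   if ((a < 0) != (b < 0)) {
      /* 64-bit two's complement negation done on the two halves. */
      lo = ~lo;
      hi = ~hi;
      lo += 1;
      if (lo == 0)
         hi += 1;   /* carry out of the low half */
   }

   /* Reinterpret the high half as signed (gcc and clang wrap here). */
   return (int32_t)hi;
}
```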

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
10 years ago i965/fs: Assume fragment color clamping is off when precompiling.
Kenneth Graunke [Sun, 26 Jan 2014 03:22:56 +0000 (19:22 -0800)]
i965/fs: Assume fragment color clamping is off when precompiling.

Modern applications frequently use both UNORM buffers and FLOAT buffers
with color clamping disabled.  (FLOAT with clamping explicitly enabled
and SNORM buffers appear to be less common.)  We don't need to emit
saturates in the fragment shader in either of the common cases.

Mesa sets ctx->Color._ClampFragmentColor to false if all the color
buffers are UNORM.  Also, for GL_FIXED_ONLY mode (the default in
legacy OpenGL), it will be false if any FLOAT buffers are bound.
Since the common case is false, that should be our default.

Thanks to Roland Scheidegger for pointing out some faulty logic
in v1 of this patch (unnecessary code and incorrect explanations).

v2: Drop superfluous code and reword commit message.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
10 years ago egl: Add EGL_CHROMIUM_sync_control extension.
Sarah Sharp [Tue, 6 May 2014 19:10:57 +0000 (12:10 -0700)]
egl: Add EGL_CHROMIUM_sync_control extension.

Chromium defined a new GL extension (that isn't registered with Khronos).
We need to add an EGL extension for it, so we can migrate ChromeOS on
Intel systems to use EGL instead of GLX.

http://git.chromium.org/gitweb/?p=chromium/src/third_party/khronos.git;a=commitdiff;h=27cbfdab35c601f70aa150581ad1448d0401f447

The EGL_CHROMIUM_sync_control extension is similar to the GLX extension
OML_sync_control, but only defines one function,
eglGetSyncValuesCHROMIUM, which is equivalent to glXGetSyncValuesOML.

http://www.opengl.org/registry/specs/OML/glx_sync_control.txt

Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: Jamey Sharp <jamey@minilop.net>
Cc: Ian Romanick <idr@freedesktop.org>
Cc: Stéphane Marchesin <stephane.marchesin@gmail.com>
10 years ago Import eglextchromium.h from Chromium.
Sarah Sharp [Tue, 6 May 2014 19:10:56 +0000 (12:10 -0700)]
Import eglextchromium.h from Chromium.

In order to support the (currently unregistered) Chromium-specific EGL
extension eglGetSyncValuesCHROMIUM on Intel systems, we need to import
the Chromium header that defines it.  The file was downloaded from

https://chromium.googlesource.com/chromium/chromium/+/trunk/ui/gl/EGL/eglextchromium.h

It is subject to the license found at

https://chromium.googlesource.com/chromium/chromium/+/trunk/LICENSE

I have imported the header file and added the license text to the top.
The only change was to fix the include guard on the Chromium header to
change the last line from a #define to a #endif, which makes the header
actually compile.
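
For reference, a well-formed include guard has the shape below. The guard
macro and the typedef are placeholders, not the contents of
eglextchromium.h; the point is that the opening #ifndef must be closed by
an #endif on the last line.

```c
#ifndef EXAMPLE_EXT_H
#define EXAMPLE_EXT_H

typedef int example_sync_value;   /* hypothetical declaration */

#endif /* EXAMPLE_EXT_H */
```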

Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Cc: Jamey Sharp <jamey@minilop.net>
Cc: Ian Romanick <idr@freedesktop.org>
Cc: Stéphane Marchesin <stephane.marchesin@gmail.com>
10 years ago darwin: Fix test for kCGLPFAOpenGLProfile support at runtime
Jeremy Huddleston Sequoia [Tue, 20 May 2014 17:53:00 +0000 (10:53 -0700)]
darwin: Fix test for kCGLPFAOpenGLProfile support at runtime

Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
10 years ago freedreno: don't advertise texture arrays for now
Rob Clark [Tue, 20 May 2014 14:52:56 +0000 (10:52 -0400)]
freedreno: don't advertise texture arrays for now

I think a3xx and later should support (it is part of GLES3), but this
isn't needed for the time being and still needs to be reversed.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years ago glapi: Avoid heap corruption in _glapi_table
Jeremy Huddleston Sequoia [Tue, 20 May 2014 08:37:58 +0000 (01:37 -0700)]
glapi: Avoid heap corruption in _glapi_table

Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Reviewed-by: Chia-I Wu <olv@lunarg.com>
10 years ago freedreno/a3xx: shadow sampler support
Rob Clark [Mon, 19 May 2014 21:56:11 +0000 (17:56 -0400)]
freedreno/a3xx: shadow sampler support

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years ago freedreno/a3xx/compiler: refactor trans_samp()
Rob Clark [Mon, 19 May 2014 21:34:54 +0000 (17:34 -0400)]
freedreno/a3xx/compiler: refactor trans_samp()

Split it up into some smaller fxns so it doesn't grow into a huge
monster as we add things.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years ago freedreno: update generated headers
Rob Clark [Mon, 19 May 2014 21:28:31 +0000 (17:28 -0400)]
freedreno: update generated headers

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years ago meta: Avoid _swrast_BlitFramebuffer in the meta CopyTexSubImage code.
Kenneth Graunke [Mon, 19 May 2014 05:26:59 +0000 (22:26 -0700)]
meta: Avoid _swrast_BlitFramebuffer in the meta CopyTexSubImage code.

This is a replacement for bd44ac8b5ca08016bb064b37edaec95eccfdbcd5
that should actually work.

Fixes Piglit's copyteximage-border on swrast, as well as one of
es3conform's packed_pixels_pixelstore tests.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78546
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77705
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
10 years ago meta: Split _swrast_BlitFramebuffer out of the meta blit path.
Kenneth Graunke [Mon, 19 May 2014 05:16:01 +0000 (22:16 -0700)]
meta: Split _swrast_BlitFramebuffer out of the meta blit path.

Separating the software fallbacks from the rest of the meta path (which
is usually hardware accelerated) gives callers better control over their
blitting options.

For example, i965 might want to try meta blit, hardware blits, then
swrast as a last resort.  Splitting it makes that possible.
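
A caller-controlled fallback chain of that kind might look like the sketch
below. It assumes, hypothetically, that each path returns the mask of
buffers it could not handle; the function names and mask bits are invented
for illustration and are not the Mesa entry points.

```c
#define BUF_COLOR   (1u << 0)   /* hypothetical buffer mask bits */
#define BUF_DEPTH   (1u << 1)
#define BUF_STENCIL (1u << 2)

/* Each path returns the bits it could NOT handle. */
typedef unsigned (*blit_path)(unsigned mask);

static unsigned try_meta(unsigned mask)    { return mask & ~BUF_COLOR; }
static unsigned try_hw_blit(unsigned mask) { return mask & ~BUF_DEPTH; }
static unsigned swrast_blit(unsigned mask) { (void)mask; return 0; }

/* Try each path in order until nothing is left to blit.
 * Returns the leftover mask and stores how many paths were tried. */
static unsigned blit(unsigned mask, int *paths_used)
{
   static const blit_path paths[] = { try_meta, try_hw_blit, swrast_blit };
   unsigned i;

   *paths_used = 0;
   for (i = 0; i < sizeof(paths) / sizeof(paths[0]) && mask != 0; i++) {
      mask = paths[i](mask);
      (*paths_used)++;
   }
   return mask;
}
```

Because the software fallback is just the last entry in the list, a driver
can reorder or drop paths without touching the others.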

This updates all callers to maintain the existing behavior (even in the
few cases where it isn't desirable behavior - later patches can change
that).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
10 years ago meta: Drop unnecessary early returns in _mesa_meta_BlitFramebuffer.
Kenneth Graunke [Mon, 19 May 2014 02:32:44 +0000 (19:32 -0700)]
meta: Drop unnecessary early returns in _mesa_meta_BlitFramebuffer.

These aren't necessary - all of the following code is predicated on mask
being non-zero, so no code will get executed anyway.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Courtney Goeltzenleuchter <courtney@lunarg.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
10 years ago Revert "i965: Don't _swrast_BlitFramebuffer when doing CopyTexSubImage."
Kenneth Graunke [Mon, 19 May 2014 02:24:30 +0000 (19:24 -0700)]
Revert "i965: Don't _swrast_BlitFramebuffer when doing CopyTexSubImage."

This reverts commit bd44ac8b5ca08016bb064b37edaec95eccfdbcd5.

Fixes:
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78842
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78843

Re-breaks:
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77705
but that will be fixed properly in a few commits.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
10 years ago docs: update the prerequisites section
Brian Paul [Mon, 19 May 2014 13:54:30 +0000 (07:54 -0600)]
docs: update the prerequisites section

SCons is required for Windows.  Add links to flex/bison for Windows.
Reorder items and improve formatting.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years ago i965/fbo: Only try stencil meta blits on gen >= 8
Topi Pohjolainen [Mon, 19 May 2014 07:10:33 +0000 (10:10 +0300)]
i965/fbo: Only try stencil meta blits on gen >= 8

I don't have an ILK at hand but the fix should be trivial.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78872
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-and-tested-by: Kenneth Graunke <kenneth@whitecape.org>
10 years ago mesa: Disable GL_EXT_framebuffer_multisample_blit_scaled on Broadwell.
Kenneth Graunke [Wed, 14 May 2014 01:53:28 +0000 (18:53 -0700)]
mesa: Disable GL_EXT_framebuffer_multisample_blit_scaled on Broadwell.

It's not properly implemented in the meta code, and we don't have time
to fix it for 10.2.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
10 years ago llvmpipe: do IR counting for shader cache management after optimization.
Roland Scheidegger [Fri, 16 May 2014 20:45:27 +0000 (22:45 +0200)]
llvmpipe: do IR counting for shader cache management after optimization.

2ea923cf571235dfe573c35c3f0d90f632bd86d8 had the side effect of IR counting
now being done after IR optimization instead of before. Some quick analysis
shows that there's roughly 1.5 times more IR instructions before optimization
than after, hence the effective shader cache size got quite a bit smaller.
Could counter this with an increase of the instruction limit but it probably
makes more sense to count them after optimizations, so move that code.

Reviewed-by: Brian Paul <brianp@vmware.com>
10 years ago i965: Rename brw_disasm to brw_disassemble_inst.
Vinson Lee [Mon, 19 May 2014 07:39:12 +0000 (00:39 -0700)]
i965: Rename brw_disasm to brw_disassemble_inst.

Fixes build error introduced with commit
4b04152db055babb8b06929a0c9ebea5c7f4fb92.

  CC       test_eu_compact.o
test_eu_compact.c: In function ‘test_compact_instruction’:
test_eu_compact.c:54:3: error: implicit declaration of function ‘brw_disasm’ [-Werror=implicit-function-declaration]
   brw_disasm(stderr, &src, brw->gen, false);
   ^

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78888
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
10 years ago i965: Fix a "discards 'const' qualifier" warning.
Kenneth Graunke [Mon, 19 May 2014 06:36:19 +0000 (23:36 -0700)]
i965: Fix a "discards 'const' qualifier" warning.

Trivial.

10 years ago i965/fs: Finally kill struct brw_wm_compile (better known as 'c').
Kenneth Graunke [Wed, 14 May 2014 08:35:30 +0000 (01:35 -0700)]
i965/fs: Finally kill struct brw_wm_compile (better known as 'c').

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years ago i965/fs: Stop copying the program key.
Kenneth Graunke [Wed, 14 May 2014 08:32:54 +0000 (01:32 -0700)]
i965/fs: Stop copying the program key.

We already have a perfectly good copy of the program key, and nobody is
going to modify it.  The only reason we copied it was because the
brw_wm_compile structure embedded the key rather than pointing to it.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years ago i965/fs: Rip struct brw_wm_compile out of the visitors and generators.
Kenneth Graunke [Wed, 14 May 2014 07:41:41 +0000 (00:41 -0700)]
i965/fs: Rip struct brw_wm_compile out of the visitors and generators.

Instead, just pass the key and prog_data as separate parameters.

This moves it up a level - one step further toward getting rid of it.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years ago i965/fs: Plumb a mem_ctx all the way through the FS compile.
Kenneth Graunke [Wed, 14 May 2014 08:21:02 +0000 (01:21 -0700)]
i965/fs: Plumb a mem_ctx all the way through the FS compile.

'c' is going away, but we still need a memory context that lives
for the duration of the compile.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years ago i965/fs: Use 'c' as the mem_ctx in fs_visitor.
Kenneth Graunke [Wed, 14 May 2014 08:07:32 +0000 (01:07 -0700)]
i965/fs: Use 'c' as the mem_ctx in fs_visitor.

Previously, the memory context situation was a bit of a mess:

fs_visitor allocated its own memory context, and freed it in the
destructor.  However, some data produced by fs_visitor (such as the list
of instructions) needs to live beyond when fs_visitor is "done", so the
caller can pass it to fs_generator.

Everything worked out because brw_wm_fs_emit's fs_visitor variables
happen to not go out of scope until the end of the function.  But that
meant that moving the declaration of, say, the SIMD16 fs_visitor
instance, could cause everything to explode.

Using a memory context that exists for the duration of the compile is
clearer, and should be equivalent.
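
To make the lifetime argument concrete, here is a toy memory context in C.
It is a deliberately simplified stand-in, not Mesa's allocator, and every
name in it is hypothetical. Everything allocated from the context stays
valid until the context itself is released, regardless of which object
requested it or when that object goes out of scope.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Header placed in front of each allocation; the union keeps the
 * payload suitably aligned for any type. */
union mem_block {
   union mem_block *next;
   max_align_t align;
};

struct mem_ctx {
   union mem_block *head;   /* singly linked list of live blocks */
};

/* Allocate size bytes owned by ctx. Returns NULL on failure. */
static void *ctx_alloc(struct mem_ctx *ctx, size_t size)
{
   union mem_block *b;

   assert(ctx != NULL);
   b = malloc(sizeof(*b) + size);
   if (b == NULL)
      return NULL;

   b->next = ctx->head;
   ctx->head = b;
   return b + 1;            /* payload follows the header */
}

/* Release everything owned by ctx. Returns the number of blocks freed. */
static int ctx_free_all(struct mem_ctx *ctx)
{
   int n = 0;

   while (ctx->head != NULL) {
      union mem_block *next = ctx->head->next;
      free(ctx->head);
      ctx->head = next;
      n++;
   }
   return n;
}
```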

Ultimately, we don't want to use 'c', but this matches the behavior of
fs_generator and gen8_fs_generator, so it'll be simple to change later.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years ago i965/fs: Actually free program data on the error path.
Kenneth Graunke [Wed, 14 May 2014 08:04:02 +0000 (01:04 -0700)]
i965/fs: Actually free program data on the error path.

We throw away the data generated during compilation on the success path,
so we really ought to on the failure path as well.  The caller has no
access to it anyway, so it's purely leaked.
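
The usual C idiom for this is a single cleanup label shared by both paths.
The sketch below is hypothetical (compile() and the live_allocs counter
exist only to make the behaviour observable) and is not the code changed by
this commit.

```c
#include <stdlib.h>

/* Returns 0 on success and -1 on failure. The scratch buffer is
 * released on both paths, so live_allocs ends where it started. */
static int compile(int should_fail, int *live_allocs)
{
   int ret = -1;
   char *scratch = malloc(64);

   if (scratch == NULL)
      return -1;
   (*live_allocs)++;

   if (should_fail)
      goto out;             /* a bare return here would leak scratch */

   ret = 0;

out:
   free(scratch);
   (*live_allocs)--;
   return ret;
}
```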

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years ago i965/fs: Replace c->key with a direct reference in the generators.
Kenneth Graunke [Wed, 14 May 2014 07:24:50 +0000 (00:24 -0700)]
i965/fs: Replace c->key with a direct reference in the generators.

'c' is going away.  This is also a bit shorter.

Marking the key pointer as const will also deter people from changing
it in these classes, as that's absolutely not OK.
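
A small sketch of what a const key pointer buys. `ProgKey` and `Generator` are hypothetical names used for illustration only, not the real i965 types.

```cpp
#include <cassert>

// Hypothetical program key; the real key structure has many more fields.
struct ProgKey {
    unsigned flags;
};

class Generator {
public:
    explicit Generator(const ProgKey *k) : key(k) { assert(k != nullptr); }

    // Reading through the pointer is fine.
    bool uses_flag(unsigned bit) const { return ((key->flags >> bit) & 1u) != 0; }

    // Writing through it is rejected at compile time, e.g.:
    //   void clobber() { key->flags = 0; }   // error: assignment to const member

private:
    const ProgKey *const key;  // cannot modify the key, cannot reseat the pointer
};
```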

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years ago i965/fs: Replace c->key with a direct reference in fs_visitor.
Kenneth Graunke [Wed, 14 May 2014 04:06:00 +0000 (21:06 -0700)]
i965/fs: Replace c->key with a direct reference in fs_visitor.

'c' is going away.  This is also shorter.

Marking the key pointer as const will also deter people from changing
it in fs_visitor, as it's absolutely not OK to modify it there.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years ago i965/fs: Replace c->prog_data with a direct reference in the generators.
Kenneth Graunke [Wed, 14 May 2014 07:20:24 +0000 (00:20 -0700)]
i965/fs: Replace c->prog_data with a direct reference in the generators.

'c' is going away.  This is also a bit shorter.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years ago i965/fs: Replace c->prog_data with a direct reference in fs_visitor.
Kenneth Graunke [Wed, 14 May 2014 07:17:03 +0000 (00:17 -0700)]
i965/fs: Replace c->prog_data with a direct reference in fs_visitor.

'c' is going away.  This is also a bit shorter.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years ago i965/fs: Move some flags that affect code generation to fs_visitor.
Kenneth Graunke [Wed, 14 May 2014 07:08:58 +0000 (00:08 -0700)]
i965/fs: Move some flags that affect code generation to fs_visitor.

runtime_check_aads_emit isn't actually used currently, but I believe
we should be using it on Gen4-5, so I haven't eliminated it.
See https://bugs.freedesktop.org/show_bug.cgi?id=78679 for details.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years ago i965/fs: Move payload register info from brw_wm_compile to fs_visitor.
Kenneth Graunke [Wed, 14 May 2014 04:52:51 +0000 (21:52 -0700)]
i965/fs: Move payload register info from brw_wm_compile to fs_visitor.

This data is created by fs_visitor and only used when emitting code,
so keeping it in fs_visitor makes sense.  I decided it would be
reasonable to group these all together in a struct, since they're
highly related.

v2: s/nr_payload_regs/payload.num_regs/ in some comments (chrisf).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years ago i965/fs: Simplify gl_SampleMaskIn handling.
Kenneth Graunke [Wed, 14 May 2014 04:21:21 +0000 (21:21 -0700)]
i965/fs: Simplify gl_SampleMaskIn handling.

As far as I can tell, there's no point in allocating an extra register
and generating a MOV---we can just use the copy provided as part of our
thread payload directly.  It's already in the right format.

Of course, there are zero Piglit tests for this.  We don't actually ship
the extension (GL_ARB_gpu_shader5) that exposes this functionality
either.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years ago i965/fs: Rename c->sample_mask_reg to sample_mask_in_reg.
Kenneth Graunke [Wed, 14 May 2014 04:36:28 +0000 (21:36 -0700)]
i965/fs: Rename c->sample_mask_reg to sample_mask_in_reg.

This is actually for gl_SampleMaskIn, which is quite different than
gl_SampleMask.  Renaming should help avoid confusion.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years ago i965/fs: Move c->last_scratch into fs_visitor.
Kenneth Graunke [Wed, 14 May 2014 04:00:35 +0000 (21:00 -0700)]
i965/fs: Move c->last_scratch into fs_visitor.

Nothing outside of fs_visitor uses it, so we may as well keep it
internal.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years ago i965/fs: Move total_scratch calculation into fs_visitor::run().
Kenneth Graunke [Wed, 14 May 2014 03:51:32 +0000 (20:51 -0700)]
i965/fs: Move total_scratch calculation into fs_visitor::run().

With this one use gone, c->last_scratch is now only used inside
fs_visitor.  The rest of the driver uses prog_data->total_scratch.

We already compute similar prog_data fields in fs_visitor, so this
seems reasonable.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years ago i965/fs: Move perf_debug about register spilling to a more obvious spot.
Kenneth Graunke [Wed, 14 May 2014 03:41:27 +0000 (20:41 -0700)]
i965/fs: Move perf_debug about register spilling to a more obvious spot.

The if (!allocated_without_spills) block is an obvious spot for this
performance warning message.

In the Vec4 backend, scratch is also used for indirect access of
temporary arrays.  The FS backend doesn't implement that yet, but
if it did, this message would be inaccurate, since scratch access
wouldn't necessarily mean spilling.  Moving it preemptively fixes that.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years ago i965: Rename brw/gen8_dump_compile to brw/gen8_disassemble.
Kenneth Graunke [Thu, 15 May 2014 23:10:09 +0000 (16:10 -0700)]
i965: Rename brw/gen8_dump_compile to brw/gen8_disassemble.

"Disassemble" is an accurate description of what this function does.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years ago i965: Rename brw_disasm/gen8_disassemble to brw/gen8_disassemble_inst.
Kenneth Graunke [Thu, 15 May 2014 23:02:16 +0000 (16:02 -0700)]
i965: Rename brw_disasm/gen8_disassemble to brw/gen8_disassemble_inst.

We're going to use "disassemble" for the function that disassembles
the whole program.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years ago i965: Fix dump_prog_cache to handle compacted instructions.
Kenneth Graunke [Thu, 15 May 2014 22:58:07 +0000 (15:58 -0700)]
i965: Fix dump_prog_cache to handle compacted instructions.

dump_prog_cache has interpreted compacted instructions as full size
instructions, decoding garbage and complaining about invalid values.

We can just use brw_dump_compile to handle this correctly in less code.
The output format changes slightly, but it's still perfectly acceptable.
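
A sketch of why treating every instruction as full size decodes garbage. It assumes the usual layout of 16-byte full instructions and 8-byte compacted ones, with a compaction flag in bit 29 of the first dword; `instruction_offsets` is a hypothetical helper, not the driver's code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Assumption: bit 29 of the first dword marks a compacted instruction.
constexpr uint32_t kCompactedBit = 1u << 29;

// Returns the byte offset of every instruction in the program, stepping
// 8 bytes over compacted instructions and 16 bytes over full ones.
std::vector<size_t> instruction_offsets(const uint8_t *program, size_t size) {
    assert(program != nullptr);
    std::vector<size_t> offsets;
    size_t offset = 0;
    while (offset + 8 <= size) {
        uint32_t dw0;
        std::memcpy(&dw0, program + offset, sizeof dw0);
        const size_t len = (dw0 & kCompactedBit) ? 8 : 16;
        if (offset + len > size)
            break;  // truncated trailing instruction
        offsets.push_back(offset);
        offset += len;
    }
    return offsets;
}
```

A loop that always advances by 16 bytes would land in the middle of the next instruction as soon as it passed a compacted one.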

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years ago i965: Use brw_dump_compile for clip, SF, and old GS programs.
Kenneth Graunke [Thu, 15 May 2014 21:12:48 +0000 (14:12 -0700)]
i965: Use brw_dump_compile for clip, SF, and old GS programs.

Looping over the instructions and calling brw_disasm doesn't handle
compacted instructions.  In most cases, this hasn't been a problem since
we don't compact prior to Sandybridge.

However, Sandybridge's transform feedback GS program should already be
compacted, and so this ought to fix decoding of that.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years ago nv50/ir: fix integer mul lowering for u32 x u32 -> high u32
Ilia Mirkin [Tue, 13 May 2014 15:23:33 +0000 (11:23 -0400)]
nv50/ir: fix integer mul lowering for u32 x u32 -> high u32

UNION appears to expect that all of its sources are conditionally
defined. Otherwise it inserts an unpredicated mov instruction which
overwrites the desired result. This fixes tests that use UMUL_HI, and
much less directly, unsigned integer division by a constant, which uses
this functionality in a peephole pass.
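
For reference, a sketch of what the operation computes and why division by a constant depends on it. This is plain C++ arithmetic, not the nv50 lowering itself (which works on predicated instructions and UNION); the function names are illustrative.

```cpp
#include <cstdint>

// High 32 bits of the 64-bit product a * b, built from 16-bit halves so
// that every intermediate value fits in 32 bits.
uint32_t umul_hi(uint32_t a, uint32_t b) {
    const uint32_t a_lo = a & 0xffffu, a_hi = a >> 16;
    const uint32_t b_lo = b & 0xffffu, b_hi = b >> 16;

    const uint32_t lo_lo = a_lo * b_lo;
    const uint32_t hi_lo = a_hi * b_lo;
    const uint32_t lo_hi = a_lo * b_hi;
    const uint32_t hi_hi = a_hi * b_hi;

    // Sum of the terms that land in bits 16..31; its upper half is the
    // carry into the high word.
    const uint32_t mid = (lo_lo >> 16) + (hi_lo & 0xffffu) + (lo_hi & 0xffffu);

    return hi_hi + (hi_lo >> 16) + (lo_hi >> 16) + (mid >> 16);
}

// Unsigned division by the constant 3 as a multiply-high plus a shift:
// x / 3 == (x * 0xAAAAAAAB) >> 33 for every 32-bit x.
uint32_t div3(uint32_t x) {
    return umul_hi(x, 0xAAAAAAABu) >> 1;
}
```

If the high word is wrong, every division rewritten this way is wrong too, which is how the bug reached the division tests.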

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
10 years ago nv50/ir: make sure that texprep/texquerylod's args get coalesced
Ilia Mirkin [Tue, 13 May 2014 05:31:20 +0000 (01:31 -0400)]
nv50/ir: make sure that texprep/texquerylod's args get coalesced

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
10 years ago freedreno/a3xx: use util_format_compose_swizzles()
Rob Clark [Sun, 18 May 2014 19:19:34 +0000 (15:19 -0400)]
freedreno/a3xx: use util_format_compose_swizzles()

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years ago freedreno/a3xx/compiler: 1D textures
Rob Clark [Sat, 17 May 2014 17:49:52 +0000 (13:49 -0400)]
freedreno/a3xx/compiler: 1D textures

Gallium already gives us height==1 for these, so the texture state is
already set up correctly to emulate 1D textures as an Nx1 2D texture.  We
just need to supply the .y coord.
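
A sketch of the emulation in plain C++. The value 0.5 for the supplied .y coordinate is an assumption made for illustration (any value inside the single row selects the same texels), and `Texture2D`, `sample_2d_nearest` and `sample_1d` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Texture2D {
    size_t width, height;
    std::vector<float> texels;  // row-major, width * height entries
};

// Maps a normalized coordinate to a texel index, clamped to the edge.
static size_t texel_index(float coord, size_t extent) {
    if (coord < 0.0f)
        coord = 0.0f;
    const size_t i = static_cast<size_t>(coord * static_cast<float>(extent));
    return i < extent ? i : extent - 1;
}

float sample_2d_nearest(const Texture2D &t, float s, float v) {
    return t.texels[texel_index(v, t.height) * t.width + texel_index(s, t.width)];
}

// A 1D lookup becomes a 2D lookup into the single row of an Nx1 texture.
float sample_1d(const Texture2D &t, float s) {
    assert(t.height == 1);
    return sample_2d_nearest(t, s, 0.5f);  // assumed .y value
}
```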

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years ago freedreno: fix caps
Rob Clark [Sun, 18 May 2014 12:02:08 +0000 (08:02 -0400)]
freedreno: fix caps

In particular, we want mesa to emulate primitive restart for us.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years ago freedreno: fix index buffer offset
Rob Clark [Sat, 17 May 2014 17:50:10 +0000 (13:50 -0400)]
freedreno: fix index buffer offset

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years ago freedreno/a3xx: add sRGB texture support
Rob Clark [Sat, 17 May 2014 00:29:44 +0000 (20:29 -0400)]
freedreno/a3xx: add sRGB texture support

That was easy.  Turns out it is just a matter of setting one bit.
Enable sampling from sRGB texture, and therefore enable GL 2.1 :-)

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years ago freedreno: update generated headers
Rob Clark [Sat, 17 May 2014 00:07:36 +0000 (20:07 -0400)]
freedreno: update generated headers

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years ago gallivm: (trivial) fix compilation with llvm 3.1, 3.2
Roland Scheidegger [Sat, 17 May 2014 00:03:35 +0000 (02:03 +0200)]
gallivm: (trivial) fix compilation with llvm 3.1, 3.2

I actually checked the getModuleIdentifier() function exists with 3.1 but
missed that the file moved...
This fixes https://bugs.freedesktop.org/show_bug.cgi?id=78803

10 years ago gallivm: print out how long it takes to optimize shader IR.
Roland Scheidegger [Thu, 15 May 2014 23:01:07 +0000 (01:01 +0200)]
gallivm: print out how long it takes to optimize shader IR.

Enabled with GALLIVM_DEBUG=perf (which up to now was only used to print
warnings for unoptimized code).

While some unexpectedly long shader compile times for some shaders were fixed
with 8a9f5ecdb116d0449d63f7b94efbfa8b205d826f this should help recognize such
problems in the future. For now though only available in debug builds (which
are not always suitable for such analysis). And since this uses system time,
it might not be all that accurate (even llvmpipe's own rasterization threads
might be running at the same time, or just other tasks).
(llvmpipe also has LP_DEBUG=counters but this only gives an average per shader
and the total time for all shaders.)
This prints information like this:
optimizing module fs17_variant0 took 1 msec
optimizing module setup_variant_0 took 0 msec
optimizing module draw_llvm_vs_variant0 took 9 msec
optimizing module draw_llvm_vs_variant0 took 12 msec
optimizing module fs17_variant1 took 2 msec
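
A sketch of per-module timing that produces lines in the format shown above. It uses std::chrono::steady_clock as a stand-in for whatever time source gallivm actually uses, and `time_optimization` is a hypothetical helper.

```cpp
#include <chrono>
#include <string>

// Runs the given optimization callable once and returns a report line in
// the "optimizing module NAME took N msec" format.
template <typename Fn>
std::string time_optimization(const std::string &module_name, Fn &&optimize) {
    const auto start = std::chrono::steady_clock::now();
    optimize();
    const auto end = std::chrono::steady_clock::now();

    const long long msec =
        std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count();

    return "optimizing module " + module_name + " took " +
           std::to_string(msec) + " msec";
}
```

Because this measures elapsed time rather than CPU time, other threads running concurrently inflate the number, which is the inaccuracy the message warns about.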

v2: rebase for recent gallivm compilation changes, and print time for whole
modules instead of functions (otherwise it would be very spammy since it would
include all trivial inline sse2 functions), using the shiny new module names,
prying them off LLVM using new helper (not available through C bindings).
Per-function timings, while possibly giving more information (if, for instance,
there were a problem only in the partial and not the whole function), don't seem
all that useful for now.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
10 years ago gallivm: give more verbose names to modules
Roland Scheidegger [Thu, 15 May 2014 23:00:53 +0000 (01:00 +0200)]
gallivm: give more verbose names to modules

When we had just one module "gallivm" was an appropriate name. But now we have
modules containing all functions for a particular variant, so give it a
corresponding name (this is really just for helping debugging).

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
10 years ago mesa: fix double-freeing of dispatch tables inside glBegin/End.
Brian Paul [Thu, 15 May 2014 21:49:14 +0000 (15:49 -0600)]
mesa: fix double-freeing of dispatch tables inside glBegin/End.

We allocate dispatch tables for BeginEnd and OutsideBeginEnd.  But
when we destroy the context we were freeing the BeginEnd and Exec
tables.  If Exec==BeginEnd we did a double-free.  This would happen
if the context was destroyed while inside a glBegin/End pair.  Now
free the BeginEnd and OutsideBeginEnd pointers.
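
A reduced sketch of the ownership rule behind this fix. The field names BeginEnd, OutsideBeginEnd and Exec follow the description above; the rest (`DispatchTable`, `create_context`, `begin`, `destroy_context`) is a hypothetical model, not Mesa's context code.

```cpp
#include <cassert>

struct DispatchTable {
    int placeholder;
};

struct Context {
    DispatchTable *BeginEnd;         // owned
    DispatchTable *OutsideBeginEnd;  // owned
    DispatchTable *Exec;             // alias of one of the two above, never owned
};

Context *create_context() {
    Context *ctx = new Context;
    ctx->BeginEnd = new DispatchTable{};
    ctx->OutsideBeginEnd = new DispatchTable{};
    ctx->Exec = ctx->OutsideBeginEnd;
    return ctx;
}

// Models entering a glBegin/End pair: Exec now aliases BeginEnd.
void begin(Context *ctx) { ctx->Exec = ctx->BeginEnd; }

// Frees only the two owning pointers and returns how many tables were
// released. Freeing BeginEnd and Exec instead would release the same
// table twice whenever Exec == BeginEnd.
int destroy_context(Context *ctx) {
    assert(ctx->Exec == ctx->BeginEnd || ctx->Exec == ctx->OutsideBeginEnd);
    int freed = 0;
    delete ctx->BeginEnd;
    ++freed;
    delete ctx->OutsideBeginEnd;
    ++freed;
    delete ctx;
    return freed;
}
```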

Cc: "10.1", "10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
10 years ago i965: Use binary literals for counter select.
Matt Turner [Tue, 4 Mar 2014 03:10:44 +0000 (19:10 -0800)]
i965: Use binary literals for counter select.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years ago glsl_to_tgsi: Make sure the 'shader' member is always initialized
Michel Dänzer [Thu, 15 May 2014 03:23:16 +0000 (12:23 +0900)]
glsl_to_tgsi: Make sure the 'shader' member is always initialized

Fixes the valgrind report below and random crashes with piglit on radeonsi.

==30005== Conditional jump or move depends on uninitialised value(s)
==30005==    at 0xB13584E: st_translate_program (st_glsl_to_tgsi.cpp:5100)
==30005==    by 0xB14698B: st_translate_fragment_program (st_program.c:747)
==30005==    by 0xB14777D: st_get_fp_variant (st_program.c:824)
==30005==    by 0xB11219C: get_color_fp_variant (st_cb_drawpixels.c:1042)
==30005==    by 0xB1131AE: st_DrawPixels (st_cb_drawpixels.c:1154)
==30005==    by 0xAFF8806: _mesa_DrawPixels (drawpix.c:162)
==30005==    by 0x4EB86DB: stub_glDrawPixels (generated_dispatch.c:6640)
==30005==    by 0x4F1DF08: piglit_visualize_image (piglit-util-gl.c:1574)
==30005==    by 0x40691D: draw_image_to_window_system_fb(int, bool) (draw-buffers-common.cpp:733)
==30005==    by 0x406C8B: draw_reference_image(bool, bool) (draw-buffers-common.cpp:854)
==30005==    by 0x40722A: piglit_display (alpha-to-coverage-dual-src-blend.cpp:117)
==30005==    by 0x4EA7168: run_test (piglit_fbo_framework.c:52)

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
10 years ago gallivm: remove optimization workaround when not having sse 4.1
Roland Scheidegger [Thu, 15 May 2014 15:01:40 +0000 (17:01 +0200)]
gallivm: remove optimization workaround when not having sse 4.1

This workaround doesn't list any llvm version, but it was introduced
2010-06-10 (e277d5c1f6b2c5a6d202561e67d2b6821a69ecc4). It is unlikely
this bug is still present in llvm versions we support (3.1+).
There's no specific test listed, but I ran lp_test_arit (which uses
the mentioned functions) on llvm 3.1 and 3.3 with sse41 disabled and
this pass enabled without issues.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
10 years ago gallivm: remove workaround for reversing optimization pass order.
Roland Scheidegger [Thu, 15 May 2014 14:26:00 +0000 (16:26 +0200)]
gallivm: remove workaround for reversing optimization pass order.

32bit code generation and llvm >= 2.7 used a different optimization pass
order - this code was initially introduced (2010-07-23) by
815e79e72c1f4aa849c0ee6103621685b678bc9d, apparently due to buggy code being
generated with then brand new llvm versions (which was llvm 2.7 plus pre 2.8
devel).
It seems highly likely that whatever this bug was, it has been fixed in
newer llvm versions, though there's no easy way to test this - the mentioned
piglit test was removed years ago, and even if you built it I'm
sceptical the glsl compiler would still produce the required code to trigger
it.
I have no idea what a good order of passes is, but just remove the workaround
and use the same order everywhere.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
10 years ago i965/gen8: Make disassembly function match brw's signature.
Matt Turner [Fri, 9 May 2014 00:27:31 +0000 (17:27 -0700)]
i965/gen8: Make disassembly function match brw's signature.

gen8_dump_compile will be called indirectly by common code used by
generations before and after the gen8 instruction format change.

Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years ago i965: Pass brw_context and assembly separately to brw_dump_compile.
Matt Turner [Fri, 9 May 2014 23:15:30 +0000 (16:15 -0700)]
i965: Pass brw_context and assembly separately to brw_dump_compile.

brw_dump_compile will be called indirectly by common code used by
generations before and after the gen8 instruction format change.

Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years ago i965: Pull brw_compact_instructions() out of brw_get_program().
Matt Turner [Wed, 7 May 2014 18:53:22 +0000 (11:53 -0700)]
i965: Pull brw_compact_instructions() out of brw_get_program().

Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years ago i965/disasm: Align send instruction meta-information with dst.
Matt Turner [Thu, 8 May 2014 23:06:33 +0000 (16:06 -0700)]
i965/disasm: Align send instruction meta-information with dst.

Has been misaligned since we added instruction offset prefixes.

Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years ago i965/disasm: Disassemble the compaction control bit.
Matt Turner [Thu, 1 May 2014 18:20:25 +0000 (11:20 -0700)]
i965/disasm: Disassemble the compaction control bit.

brw_disasm doesn't disassemble compacted instructions, so we uncompact
them before disassembling, which would unset the compaction control bit.
Instead pass it as a separate argument.

Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>