Eric Anholt [Thu, 7 Nov 2013 20:15:13 +0000 (12:15 -0800)]
glsl: Apply the transformation "1/rsq(x) == sqrt(x)" in opt_algebraic.
The comment was stale, because the lowering in question wasn't happening
in lower_instructions.cpp. Presumably if the lowering ever moves there,
we can plumb the lowering mask through to opt_algebraic.
total instructions in shared programs: 1618696 -> 1616810 (-0.12%)
instructions in affected programs: 243018 -> 241132 (-0.78%)
GAINED: 0
LOST: 0
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Eric Anholt [Thu, 7 Nov 2013 20:10:25 +0000 (12:10 -0800)]
glsl: Apply the transformation "(a ^^ a) -> false" in opt_algebraic.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Eric Anholt [Thu, 31 Oct 2013 16:32:42 +0000 (09:32 -0700)]
glsl: Apply the transformation "(a && a) -> a" in opt_algebraic.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Eric Anholt [Thu, 31 Oct 2013 07:10:32 +0000 (00:10 -0700)]
glsl: Apply the transformation "(a || a) -> a" in opt_algebraic.
total instructions in shared programs: 1732385 -> 1732373 (-0.00%)
instructions in affected programs: 416 -> 404 (-2.88%)
GAINED: 0
LOST: 0
(That's 4 already-short fragment shaders in dota2)
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
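The four opt_algebraic rewrites listed above can be illustrated on a toy expression tree. This is only a Python sketch, not the Mesa pass: the Var, Const and Expr classes, the operator names ("and", "or", "xor", "rcp", "rsq", "sqrt") and the simplify() helper are all hypothetical, and operands are assumed to be side-effect free so that structural equality is a safe test for "same value".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Const:
    value: object


@dataclass(frozen=True)
class Expr:
    op: str          # hypothetical operator name, e.g. "and", "rcp"
    operands: tuple  # child nodes (Var, Const or Expr)


def simplify(node):
    """Apply the four rewrites bottom-up; leave anything else untouched."""
    if not isinstance(node, Expr):
        return node
    ops = tuple(simplify(o) for o in node.operands)

    # (a && a) -> a and (a || a) -> a
    if node.op in ("and", "or") and ops[0] == ops[1]:
        return ops[0]
    # (a ^^ a) -> false
    if node.op == "xor" and ops[0] == ops[1]:
        return Const(False)
    # 1/rsq(x) -> sqrt(x)
    if node.op == "rcp" and isinstance(ops[0], Expr) and ops[0].op == "rsq":
        return Expr("sqrt", ops[0].operands)
    return Expr(node.op, ops)
```

The frozen dataclasses give structural equality for free, which plays the role that the reusable equality functions play in the real pass.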
Eric Anholt [Thu, 31 Oct 2013 06:56:18 +0000 (23:56 -0700)]
glsl: Move the CSE equality functions to the ir class.
I want to reuse them in opt_algebraic.
v2: Merge in Chris Forbes's break fix.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Matt Turner [Mon, 11 Nov 2013 23:54:16 +0000 (15:54 -0800)]
clover: Remove dead file from Makefile.sources.
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Kenneth Graunke [Wed, 16 Oct 2013 02:23:53 +0000 (19:23 -0700)]
i965: Rework brw_new_batch to actually start a new batch.
Previously, brw_new_batch was called just after execbuf, but before
intel_batchbuffer_reset. Essentially, it prepared for the creation of a
new batch, that wasn't yet available, and which it didn't create. This
was a bit awkward.
This patch makes brw_new_batch call intel_batchbuffer_reset as the very
first operation. This means that brw_new_batch actually creates a new
batchbuffer, and thus has it available. It brings the creation of the
new batchbuffer and BRW_NEW_BATCH flagging together into one place.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Kenneth Graunke [Wed, 16 Oct 2013 02:21:34 +0000 (19:21 -0700)]
i965: Move cache_used_by_gpu flag setting to brw_finish_batch.
It really makes more sense here.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Ian Romanick [Mon, 11 Nov 2013 19:12:08 +0000 (11:12 -0800)]
i915: Actually enable __DRI2rendererQueryExtensionRec
More rebase fail. This code was written long before i915 and i965 were
split, so most of the code in i9[16]5/intel_screen.c only needed to
exist in one place. It looks like I fixed n-1 of those places after
rebasing on the split.
I only found this from the defined-but-not-used warning for
intelRendererQueryExtension. I noticed this while fixing the other,
related warnings.
(Note: During review, we decided to *not* pick this back to 10.0.)
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Paul Berry <stereotype441@gmail.com>
Aaron Watry [Thu, 14 Nov 2013 18:17:44 +0000 (12:17 -0600)]
radeon/llvm: Free elf_buffer after use
Prevents a memory leak.
v2: Remove null check
CC: "10.0" <mesa-stable@lists.freedesktop.org>
Aaron Watry [Thu, 14 Nov 2013 18:17:43 +0000 (12:17 -0600)]
r600/llvm: Free binary.code/binary.config in r600_llvm_compile
radeon_llvm_compile allocates memory for binary.code, binary.config,
or neither depending on what's being done.
We need to make sure to free that memory after it's no longer needed.
v2: Don't bother checking for null before FREE()
CC: "10.0" <mesa-stable@lists.freedesktop.org>
Aaron Watry [Thu, 14 Nov 2013 18:17:42 +0000 (12:17 -0600)]
r600/llvm: initialize radeon_llvm_binary
use memset to initialize to 0's... otherwise code_size and config_size
could be uninitialized when read later in this method.
It's also hard to do NULL checks on uninitialized pointers.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
v2: Fix indentation
CC: "10.0" <mesa-stable@lists.freedesktop.org>
Brian Paul [Fri, 15 Nov 2013 17:25:19 +0000 (10:25 -0700)]
svga: remove unused vars in svga_hwtnl_simple_draw_range_elements()
And simplify the code.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Brian Paul [Thu, 14 Nov 2013 20:41:19 +0000 (13:41 -0700)]
svga: print warning for unsupported indirect dest reg indexing
For DX9-level shaders, there's only limited support for indirect
indexing of registers (with the loop counter register, not the
general address register.)
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Brian Paul [Thu, 14 Nov 2013 20:33:52 +0000 (13:33 -0700)]
svga: mark dest image as defined in svga_surface_copy()
After we blit/copy to a dest texture image we need to mark it as
being defined. This fixes broken mipmap generation for quite a
few texture formats. Mipgen involves making texture views and
svga_texture_view_surface() skips texture images that are undefined.
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Brian Paul [Wed, 13 Nov 2013 18:26:15 +0000 (11:26 -0700)]
svga: do primitive trimming in translate_indices()
The index translation code expects the number of indexes to be
consistent with the primitive type (ex: a multiple of 3 for
PIPE_PRIM_TRIANGLES). If it's not, we can write out of bounds
in the destination buffer.
Fixes failed assertions in the pipebuffer debug code found with
Piglit primitive-restart-draw-mode test.
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
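The trimming idea in the translate_indices() change above can be sketched as follows. This is a Python illustration under assumptions, not the svga code: the PRIM_RULES table, the primitive names and trim_count() are hypothetical, and the (minimum, increment) pairs are the usual vertex-count rules for each primitive type.

```python
# (minimum vertex count, increment) per primitive type; names are illustrative.
PRIM_RULES = {
    "points": (1, 1),
    "lines": (2, 2),
    "line_strip": (2, 1),
    "triangles": (3, 3),
    "triangle_strip": (3, 1),
    "triangle_fan": (3, 1),
    "quads": (4, 4),
    "quad_strip": (4, 2),
}


def trim_count(prim, count):
    """Return the largest index count <= count that forms whole primitives."""
    minimum, increment = PRIM_RULES[prim]
    if count < minimum:
        return 0  # not enough indices for even one primitive
    # Drop the leftover indices that cannot complete another primitive.
    return count - (count - minimum) % increment
```

Translating only trim_count(prim, n) indices means the destination buffer size and the number of indices written always agree, which is the out-of-bounds condition the commit describes.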
Brian Paul [Wed, 13 Nov 2013 18:24:41 +0000 (11:24 -0700)]
indices: add comments, assertions in u_indices.c file
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Brian Paul [Tue, 12 Nov 2013 22:09:44 +0000 (15:09 -0700)]
mesa: remove duplicated prototypes in varray.h
Aaron Watry [Wed, 6 Nov 2013 22:49:24 +0000 (16:49 -0600)]
gallium/pipe_loader: un-reference udev resources when we're done with them.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
CC: "10.0" <mesa-stable@lists.freedesktop.org>
Aaron Watry [Wed, 6 Nov 2013 22:49:23 +0000 (16:49 -0600)]
radeonsi/compute: Dispose of LLVM module after compiling kernels
v2: Fix indentation
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
CC: "10.0" <mesa-stable@lists.freedesktop.org>
Aaron Watry [Wed, 6 Nov 2013 22:49:22 +0000 (16:49 -0600)]
radeonsi/compute: Free program and program.kernels on shutdown
v2: Fix indentation
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
CC: "10.0" <mesa-stable@lists.freedesktop.org>
Aaron Watry [Wed, 6 Nov 2013 22:49:21 +0000 (16:49 -0600)]
radeon/llvm: Free created llvm memory buffer
v2: Fix indentation
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
CC: "10.0" <mesa-stable@lists.freedesktop.org>
Aaron Watry [Wed, 6 Nov 2013 22:49:20 +0000 (16:49 -0600)]
radeon/llvm: Free libelf resources
v2: Fix indentation
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
CC: "10.0" <mesa-stable@lists.freedesktop.org>
Aaron Watry [Wed, 6 Nov 2013 22:49:19 +0000 (16:49 -0600)]
radeon/llvm: fix spelling error
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
CC: "10.0" <mesa-stable@lists.freedesktop.org>
Tom Stellard [Thu, 11 Apr 2013 14:37:55 +0000 (10:37 -0400)]
clover: Support multiple devices in clCreateContextFromType() v2
v2:
- Use clGetDeviceIDs to query devices.
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
CC: "10.0" <mesa-stable@lists.freedesktop.org>
Paul Berry [Tue, 29 Oct 2013 21:41:32 +0000 (14:41 -0700)]
glsl: Rework interface block linking.
Previously, when doing intrastage and interstage interface block
linking, we only checked the interface type; this prevented us from
catching some link errors.
We now check the following additional constraints:
- For intrastage linking, the presence/absence of interface names must
match.
- For shader ins/outs, the interface names themselves must match when
doing intrastage linking (note: it's not clear from the spec whether
this is necessary, but Mesa's implementation currently relies on
it).
- Array vs. nonarray must be consistent, taking into account the
special rules for vertex-geometry linkage.
- Array sizes must be consistent (exception: during intrastage
linking, an unsized array matches a sized array).
Note: validate_interstage_interface_blocks currently handles both
uniforms and in/out variables. As a result, if all three shader types
are present (VS, GS, and FS), and a uniform interface block is
mentioned in the VS and FS but not the GS, it won't be validated. I
plan to address this in later patches.
Fixes the following piglit tests in spec/glsl-1.50/linker:
- interface-blocks-vs-fs-array-size-mismatch
- interface-vs-array-to-fs-unnamed
- interface-vs-unnamed-to-fs-array
- intrastage-interface-unnamed-array
v2: Simplify logic in intrastage_match() for handling array sizes.
Make extra_array_level const. Use an unnamed temporary
interface_block_definition in validate_interstage_interface_blocks()'s
first call to definitions->store().
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
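The intrastage constraints listed in the interface block commit above can be summarized in a small predicate. This is a Python sketch under assumptions: BlockDef and its fields are hypothetical stand-ins for what the linker records about a block, the function only mirrors the name intrastage_match() mentioned in the v2 note, and the special vertex-geometry array rules for interstage linking are deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BlockDef:
    type_name: str                    # the interface type
    instance_name: Optional[str]      # None for an unnamed block
    is_array: bool = False
    array_size: Optional[int] = None  # None on an array means "unsized"
    mode: str = "uniform"             # "uniform", "in" or "out"


def intrastage_match(a, b):
    """Check two definitions of the same block within one shader stage."""
    if a.type_name != b.type_name:
        return False
    # Presence/absence of an interface name must match.
    if (a.instance_name is None) != (b.instance_name is None):
        return False
    # For shader ins/outs the names themselves must match.
    if a.mode in ("in", "out") and a.instance_name != b.instance_name:
        return False
    # Array vs. non-array must be consistent.
    if a.is_array != b.is_array:
        return False
    # Sizes must agree, except that an unsized array matches a sized one.
    if a.is_array and None not in (a.array_size, b.array_size):
        if a.array_size != b.array_size:
            return False
    return True
```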
Paul Berry [Tue, 12 Nov 2013 18:55:18 +0000 (10:55 -0800)]
i965: Fix vertical alignment for multisampled buffers.
From the Sandy Bridge PRM, Vol 1 Part 1 7.18.3.4 (Alignment Unit
Size):
j [vertical alignment] = 4 for any render target surface is
multisampled (4x)
From the Ivy Bridge PRM, Vol 4 Part 1 2.12.2.1 (SURFACE_STATE for most
messages), under the "Surface Vertical Alignment" heading:
This field is intended to be set to VALIGN_4 if the surface was
rendered as a depth buffer, for a multisampled (4x) render target,
or for a multisampled (8x) render target, since these surfaces
support only alignment of 4.
Back in 2012 when we added multisampling support to the i965 driver,
we forgot to update the logic for computing the vertical alignment, so
we were often using a vertical alignment of 2 for multisampled
buffers, leading to subtle rendering errors.
Note that the specs also require a vertical alignment of 4 for all
Y-tiled render target surfaces; I plan to address that in a separate
patch.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=53077
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Paul Berry [Wed, 13 Nov 2013 22:24:09 +0000 (14:24 -0800)]
main: Fix MaxUniformComponents for geometry shaders.
For both vertex and fragment shaders we default MaxUniformComponents
to 4 * MAX_UNIFORMS. It makes sense to do this for geometry shaders
too; if back-ends have different limits they can override them as
necessary.
Fixes piglit test:
spec/glsl-1.50/built-in constants/gl_MaxGeometryUniformComponents
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
José Fonseca [Fri, 15 Nov 2013 15:42:02 +0000 (15:42 +0000)]
tools/trace: Several bugfixes/improvements to dump_state.py
- Don't crash with user memory pointers.
- Support old bind_*_sampler_* methods. Useful when comparing dumps
from old branches.
- Misc.
José Fonseca [Fri, 15 Nov 2013 15:32:33 +0000 (15:32 +0000)]
trace: Dump user_buffer members.
Fredrik Höglund [Mon, 11 Nov 2013 17:54:15 +0000 (18:54 +0100)]
mesa: Fix derived vertex state not being updated in glCallList()
AEcontext::NewState is not always set when the vertex array state
is changed.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71492
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Alex Deucher [Tue, 24 Sep 2013 16:13:42 +0000 (12:13 -0400)]
radeonsi: add Hawaii pci ids
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Tue, 24 Sep 2013 16:12:29 +0000 (12:12 -0400)]
radeonsi: add support for Hawaii asics (v2)
Update additional register fields.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Vinson Lee [Fri, 15 Nov 2013 06:33:56 +0000 (22:33 -0800)]
i965: Initialize schedule_node::delay.
Fixes "Uninitialized scalar field" defect reported by Coverity.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Alexander von Gluck IV [Wed, 13 Nov 2013 23:51:00 +0000 (23:51 +0000)]
haiku/swrast: Inherit gl_config, fix flush
* Inherit gl_context so we always have access to it
* Thanks curro for the idea.
* Last Haiku candidate for 10.0.0
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Roland Scheidegger [Thu, 14 Nov 2013 15:48:30 +0000 (15:48 +0000)]
llvmpipe: (trivial) fix more fallout from the setup cleanup.
Oops... Should have done some more testing.
Roland Scheidegger [Thu, 14 Nov 2013 14:42:28 +0000 (14:42 +0000)]
llvmpipe: (trivial) fix misplaced bld context assignment.
Should fix polygon offset crashes...
José Fonseca [Thu, 14 Nov 2013 14:02:24 +0000 (14:02 +0000)]
gallivm: Compile flag to debug TGSI execution through printfs.
It is similar to tgsi_exec.c's DEBUG_EXECUTION compile flag.
I prototyped this a while ago while debugging an issue, and have finally
cleaned it up and added a few more bells and whistles.
v2: Use '$' as marker; better output. Thanks to Brian, Zack and Roland
for their reviews.
Here is a sample output.
CONST[0].x = 0.00625000009 0.00625000009 0.00625000009 0.00625000009
CONST[0].y = -0.00714285718 -0.00714285718 -0.00714285718 -0.00714285718
CONST[0].z = -1 -1 -1 -1
CONST[0].w = 1 1 1 1
IN[0].x = 143.5 175.5 175.5 143.5
IN[0].y = 123.5 123.5 155.5 155.5
IN[0].z = 0 0 0 0
IN[0].w = 1 1 1 1
$ 1: RCP TEMP[0].w, IN[0].wwww
TEMP[0].w = 1 1 1 1
$ 2: MAD TEMP[0].xy, IN[0], CONST[0], CONST[0].zwzw
TEMP[0].x = -0.103124976 0.0968750715 0.0968750715 -0.103124976
TEMP[0].y = 0.117857158 0.117857158 -0.110714316 -0.110714316
$ 3: MUL OUT[0].xy, TEMP[0], TEMP[0].wwww
OUT[0].x = -0.103124976 0.0968750715 0.0968750715 -0.103124976
OUT[0].y = 0.117857158 0.117857158 -0.110714316 -0.110714316
$ 4: MUL OUT[0].z, IN[0].zzzz, TEMP[0].wwww
OUT[0].z = 0 0 0 0
$ 5: MOV OUT[0].w, TEMP[0]
OUT[0].w = 1 1 1 1
$ 6: END
OUT[0].x = -0.103124976 0.0968750715 0.0968750715 -0.103124976
OUT[0].y = 0.117857158 0.117857158 -0.110714316 -0.110714316
OUT[0].z = 0 0 0 0
OUT[0].w = 1 1 1 1
Roland Scheidegger [Thu, 14 Nov 2013 12:21:02 +0000 (12:21 +0000)]
softpipe: (trivial) fix debug code
The debug printfs wouldn't actually compile when enabled, so kill them off
and insert a new one in another place, and make sure it keeps compiling
by enclosing it in an if-0 clause.
Roland Scheidegger [Tue, 12 Nov 2013 20:02:15 +0000 (20:02 +0000)]
llvmpipe: clean up state setup code a bit
In particular get rid of home-grown vector helpers which didn't add much.
And while here fix formatting a bit. No functional change.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Roland Scheidegger [Mon, 11 Nov 2013 14:29:25 +0000 (14:29 +0000)]
gallivm,llvmpipe: fix float->srgb conversion to handle NaNs
d3d10 requires us to convert NaNs to zero for any float->int conversion.
We don't really do that, but it mostly seems to work. In particular I suspect the
very common float->unorm8 path only really passes because it relies on sse2
pack intrinsics which just happen to work by luck for NaNs (float->int
conversion in hw gives integer indeterminate value, which just happens to be
-0x80000000 hence gets converted to zero in the end after pack intrinsics).
However, float->srgb didn't get so lucky, because we need to clamp before
blending and clamping resulted in NaN behavior being undefined (and actually
got converted to 1.0 by clamping with sse2). Fix this by using a zero/one clamp
with defined nan behavior as we can handle the NaN for free this way.
I suspect there are more bugs lurking in this area (e.g. converting floats to
snorm) as we don't really use defined NaN behavior everywhere but this seems
to be good enough.
While here respecify nan behavior modes a bit, in particular the return_second
mode didn't really do what we wanted. From the caller's perspective, we really
wanted to say we need the non-nan result, but we already know the second arg
isn't a NaN. So we use this now instead, which means that cpu architectures
which actually implement min/max by always returning non-nan (that is adhering
to ieee754-2008 rules) don't need to bend over backwards for nothing.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
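The NaN behavior described in the float->srgb commit above comes down to operand order. SSE min/max return their second operand whenever either input is a NaN, which the hypothetical sse_max() and sse_min() helpers below imitate. This is a Python sketch of the general effect, not the gallivm code, and the operand orders actually used before and after the fix are an assumption.

```python
def sse_max(a, b):
    # Mirrors SSE MAXPS/MAXSS: any comparison with NaN is false,
    # so the second operand is returned.
    return a if a > b else b


def sse_min(a, b):
    # Mirrors SSE MINPS/MINSS in the same way.
    return a if a < b else b


def clamp01_constant_first(x):
    """Clamp with the constant as first operand: NaN ends up as 1.0."""
    return sse_min(sse_max(0.0, x), 1.0)


def clamp01_nan_to_zero(x):
    """Clamp with the known non-NaN constant as second operand: NaN -> 0.0."""
    return sse_min(sse_max(x, 0.0), 1.0)
```

With the constant in the second slot the max already yields a non-NaN value, so the NaN is handled for free; for ordinary inputs both variants clamp identically.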
Ian Romanick [Mon, 11 Nov 2013 19:08:26 +0000 (11:08 -0800)]
dri: Change value param to unsigned
This silences some compiler warnings in i915 and i965. See also 75982a5.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Ian Romanick [Mon, 11 Nov 2013 18:57:55 +0000 (10:57 -0800)]
i965: Use drm_intel_get_aperture_sizes instead of hard-coded 2GiB
Systems with little physical memory installed will report less than
2GiB, and some systems may (hypothetically?) have a larger address space
for the GPU. My IVB still reports 1534.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Ian Romanick [Mon, 11 Nov 2013 18:55:34 +0000 (10:55 -0800)]
i915: Use drm_intel_get_aperture_sizes instead of drmAgpSize
Send the zombie back to the grave before it infects the townsfolk.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Alexander Monakov [Sun, 3 Nov 2013 21:34:32 +0000 (01:34 +0400)]
i965: implement blit path for PBO glDrawPixels
This patch implements an accelerated path for glDrawPixels from a PBO in
i965. The code follows what intel_pixel_read, intel_pixel_copy,
intel_pixel_bitmap and intel_tex_image are doing. Piglit quick.tests
show no regressions. In my testing on IVB, the performance improvement is
huge (about 30x, didn't measure exactly) since the generic path goes via
_mesa_unpack_color_span_float, memcpy, extract_float_rgba.
Signed-off-by: Alexander Monakov <amonakov@ispras.ru>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Brian Paul [Wed, 13 Nov 2013 17:06:23 +0000 (10:06 -0700)]
docs: fill in md5 checksums for 9.2.3 release
Brian Paul [Wed, 13 Nov 2013 17:00:46 +0000 (10:00 -0700)]
docs: fix 9.2.2 -> 9.2.3 typos
Alexander von Gluck IV [Wed, 13 Nov 2013 05:39:19 +0000 (05:39 +0000)]
haiku: add swrast driver
* This is pretty small and upkeep should be minimal.
* Currently fully working.
* Candidate for 10.0.0 branch
Acked-by: Brian Paul <brianp@vmware.com>
Carl Worth [Wed, 13 Nov 2013 15:31:42 +0000 (07:31 -0800)]
docs: Import 9.2.3 release notes, add news item.
Kristian Høgsberg [Tue, 12 Nov 2013 00:35:35 +0000 (16:35 -0800)]
dri: Remove redundant createNewContext function from __DRIimageDriverExtension
createContextAttribs is a superset of what createNewContext provides.
Also remove the function typedef, since createNewContext is deprecated
and no longer used in multiple interfaces.
Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Kristian Høgsberg [Sat, 9 Nov 2013 06:10:36 +0000 (22:10 -0800)]
wayland: Use __DRIimage based getBuffers implementation when available
This lets us allocate color buffers as __DRIimages and pass them into
the driver instead of having to create a __DRIbuffer with the flink
that requires.
Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Kristian Høgsberg [Sat, 9 Nov 2013 06:06:51 +0000 (22:06 -0800)]
gbm: Add support for __DRIimage based getBuffers when available
This lets us allocate color buffers as __DRIimages and pass them into
the driver instead of having to create a __DRIbuffer with the flink
that requires.
With this patch, we can now run gbm on render-nodes. A render-node is a
drm device that doesn't support modesetting and all the legacy DRI ioctls.
flink is also not supported, but now that gbm doesn't need flink, we can
run piglit on head-less gbm or head-less GPGPU.
Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Tested-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Ander Conselvan de Oliveira [Tue, 12 Nov 2013 12:47:08 +0000 (14:47 +0200)]
dri/i915, dri/i965: Fix support for planar images
Planar images have format __DRI_IMAGE_FORMAT_NONE, but the patch that
moved the conversion from dri_format to the mesa format made it
impossible to allocate an image with that format.
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Eric Anholt [Thu, 7 Nov 2013 01:38:23 +0000 (17:38 -0800)]
i965/fs: Try a different pre-scheduling heuristic if the first spills.
Since LIFO fails on some shaders in one particular way, and non-LIFO
systematically fails in another way on different kinds of shaders, try
them both, and pick whichever one successfully register allocates first.
Slightly prefer non-LIFO in case we produce extra dependencies in register
allocation, since it should start out with fewer stalls than LIFO.
This is madness, but I haven't come up with another way to get Unigine
Tropics to not spill while keeping other programs from spilling and
retaining the non-Unigine performance wins from texture-grf.
total instructions in shared programs: 1626728 -> 1626288 (-0.03%)
instructions in affected programs: 1015 -> 575 (-43.35%)
GAINED: 50
LOST: 0
Improves Unigine Tropics performance by 14.5257% +/- 0.241838% (n=38)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70445
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
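The "try both heuristics, keep whichever allocates first" strategy from the commit above can be sketched generically. This is a Python illustration, not the i965 backend: allocate_with_fallback(), the (name, schedule) pairs and the try_allocate callback are hypothetical, with try_allocate assumed to return None when allocation would need to spill.

```python
def allocate_with_fallback(instructions, heuristics, try_allocate):
    """Try each pre-scheduling heuristic in preference order.

    heuristics:   list of (name, schedule) pairs; schedule takes a list of
                  instructions and returns a reordered list.
    try_allocate: returns an allocation, or None if it could not allocate
                  without spilling.
    Returns (name, scheduled, allocation) for the first success, or None
    if every heuristic fails and the caller has to spill.
    """
    for name, schedule in heuristics:
        scheduled = schedule(list(instructions))  # schedule a fresh copy
        allocation = try_allocate(scheduled)
        if allocation is not None:
            return name, scheduled, allocation
    return None
```

Listing the non-LIFO heuristic first in heuristics expresses the slight preference the commit message mentions, since the first successful allocation wins.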
Eric Anholt [Thu, 7 Nov 2013 01:43:25 +0000 (17:43 -0800)]
i965/fs: Do instruction pre-scheduling just before register allocation.
Long ago, the HW_REG usages in assign_curb/urb_setup() were scheduling
barriers, so we had to run the scheduler before them in order for it to be
able to do basically anything. Now that that's fixed, we can delay the
scheduling until we go to allocate (which will make the next change less
scary).
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Eric Anholt [Wed, 6 Nov 2013 07:30:33 +0000 (23:30 -0800)]
i965/fs: Ignore actual latency pre-reg-alloc.
We care about depth-until-program-end, as a proxy for "make sure I
schedule those early instructions that open up the other things that can
make progress while keeping register pressure low", not actual latency
(since we're relying on the post-register-alloc scheduling to actually
schedule for the hardware).
total instructions in shared programs: 1609931 -> 1609931 (0.00%)
instructions in affected programs: 0 -> 0
GAINED: 55
LOST: 43
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Eric Anholt [Tue, 5 Nov 2013 06:56:33 +0000 (22:56 -0800)]
i965/fs: Fix message setup for SIMD8 spills.
In the SIMD16 spilling changes, I replaced a "1" in the spill path with
"mlen", but obviously it wasn't mlen before because spills have the g0
header along with the payload. The interface I was trying to use was
asking for how many physical regs we're writing, so we're looking for "1"
or "2".
I'm guessing this actually passed piglit because the high 8 bits of the
execution mask in SIMD8 mode are all 0s.
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Eric Anholt [Mon, 14 Oct 2013 18:38:09 +0000 (11:38 -0700)]
i965/fs: Prefer things we know reduce reg pressure when pre-scheduling.
Previously, the best thing we had was to schedule the things unblocked by
the last chosen instruction, on the hope that it would be consuming two
values at the end of their live intervals while only producing one new
value. But that's just a guess, and we can do counting of usage of
registers to know when an instruction would (almost surely) reduce
register pressure.
The only failure mode I know of in this new dominant heuristic is that
inside of a loop when scheduling the iterator (for example), choosing the
last use of the iterator doesn't actually reduce the live interval of the
iterator. But it doesn't seem to matter in shader-db:
total instructions in shared programs: 1618700 -> 1618700 (0.00%)
instructions in affected programs: 0 -> 0
GAINED: 13
LOST: 0
Note: The new functions are made virtual because I expect we'll soon lift
the pre-regalloc scheduling heuristic over to the vec4 backend.
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Eric Anholt [Wed, 6 Nov 2013 00:24:58 +0000 (16:24 -0800)]
i965: Fix undefined value usage in ABO setup.
Fixes a compiler warning.
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Eric Anholt [Thu, 31 Oct 2013 17:14:17 +0000 (10:14 -0700)]
i965: Add a warning if something ever hits a bug I noticed.
We'd have to map the VBO and rewrite things to a lower stride to fix it.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Ben Skeggs [Tue, 12 Nov 2013 07:58:18 +0000 (17:58 +1000)]
nvc0: release 3d bufctx after drawing
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Francisco Jerez [Tue, 12 Nov 2013 19:14:20 +0000 (11:14 -0800)]
clover: Fix the const variant of adaptor_range::end to deal with mismatching range sizes.
Fixes infinite loop in find_grid_optimal_factor() in cases where the
user specifies a grid size with fewer dimensions than the device
supports.
Reported-by: Tom Stellard <thomas.stellard@amd.com>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Roland Scheidegger [Mon, 11 Nov 2013 15:11:59 +0000 (15:11 +0000)]
draw,llvmpipe: use exponent manipulation instead of exp2 for polygon offset
Since we explicitly require an integer input, we should avoid exp2 math
(even if we were using optimized versions) and manipulate the exponent
directly instead, which turns the exp2 into an int sub (plus some casts).
v2: fix bogus uint (needs to be int) math spotted by Matthew, fix comments
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
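The exponent trick named in the polygon offset commit above relies on the IEEE 754 binary32 layout: for an integer n, 2**n is just the biased exponent n + 127 placed in bits 23..30. The Python sketch below shows the general technique only; the helper names are hypothetical, and the shift of 23 (the float32 mantissa width) in pow2_below_exponent() is chosen for illustration rather than taken from the draw/llvmpipe code.

```python
import struct


def float_to_bits(f):
    """Reinterpret a float32 value as its 32-bit pattern (a "cast")."""
    return struct.unpack("<I", struct.pack("<f", f))[0]


def bits_to_float(b):
    """Reinterpret a 32-bit pattern as a float32 value."""
    return struct.unpack("<f", struct.pack("<I", b))[0]


def exp2_int(n):
    """2**n for integer n in the normal float32 range, without calling exp2."""
    if not -126 <= n <= 127:
        raise ValueError("exponent outside the normal float32 range")
    return bits_to_float((n + 127) << 23)


def pow2_below_exponent(z, shift=23):
    """2**(e - shift), where e is the unbiased exponent of z."""
    biased = ((float_to_bits(z) & 0x7F800000) >> 23) - shift  # the int sub
    if biased <= 0:
        return 0.0  # result would be denormal or negative; flush to zero
    return bits_to_float(biased << 23)
```

For example, pow2_below_exponent(1.0) yields 2**-23, the spacing between adjacent float32 values just above 1.0.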
Cyril Brulebois [Tue, 12 Nov 2013 09:51:00 +0000 (02:51 -0700)]
gallium: fix build on GNU/Hurd due to missing PIPE_OS_HURD detection
Thanks to Pino Toscano. Patch from Debian package.
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Petr Sebor [Mon, 11 Nov 2013 23:19:00 +0000 (16:19 -0700)]
meta: enable vertex attributes in the context of the newly created array object
Otherwise, the function would enable generic vertex attributes 0
and 1 of the array object it does not own. This was causing crashes
in Euro Truck Simulator 2, since the incorrectly enabled generic
attribute 0 in the foreign context got precedence before vertex
position attribute at later time, leading to NULL pointer dereference.
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Petr Sebor <petr@scssoft.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Brian Paul [Mon, 11 Nov 2013 20:52:45 +0000 (13:52 -0700)]
mesa: 80-column wrapping, remove trailing whitespace in arrayobj.c
Brian Paul [Mon, 11 Nov 2013 18:51:55 +0000 (11:51 -0700)]
mesa: add comment for struct gl_vertex_buffer_binding
Brian Paul [Mon, 11 Nov 2013 22:06:13 +0000 (15:06 -0700)]
mesa: call update_array_format() after error checking
We try to do all error checking before changing any GL state.
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Jordan Justen <jordan.l.justen@intel.com>
Brian Paul [Mon, 11 Nov 2013 17:57:23 +0000 (10:57 -0700)]
mesa: use _mesa_is_bufferobj() helper in _mesa_vertex_attrib_address()
And use a regular if statement to slightly improve readability.
Jordan Justen <jordan.l.justen@intel.com>
Brian Paul [Mon, 11 Nov 2013 17:56:09 +0000 (10:56 -0700)]
mesa: add const qualifiers to vertex array helper functions
Jordan Justen <jordan.l.justen@intel.com>
Ilia Mirkin [Sat, 9 Nov 2013 18:29:35 +0000 (13:29 -0500)]
nouveau/video: mark bitstream-level acceleration as unsupported
Adding a vl_mpeg-based helper didn't seem to work, as it produced data
that the card couldn't handle. (And I didn't investigate further.) This
makes the decoding functionality only accessible via XvMC and avoids
crashes when attempting to use VDPAU.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Ilia Mirkin [Sat, 9 Nov 2013 18:29:34 +0000 (13:29 -0500)]
nouveau/video: don't try on nv3x
It doesn't work, I don't know why, but no point in hanging people's
displays until it gets figured out.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Tom Stellard [Wed, 23 Oct 2013 20:02:16 +0000 (16:02 -0400)]
egl-static: Only export necessary symbols v3
This fixes a crash in glamor when mesa links against static LLVM.
v2:
- Inline LINKER_SCRIPT variable
v3: Kai Wasserbäch
- Fix out-of-tree builds
Tested-by: Kai Wasserbäch <kai@dev.carbon-project.or>
Tom Stellard [Wed, 23 Oct 2013 19:36:41 +0000 (15:36 -0400)]
configure.ac: Don't require shared LLVM when building OpenCL
This works now that pipe_*.so is no longer exporting LLVM symbols.
Tested-by: Kai Wasserbäch <kai@dev.carbon-project.or>
Tom Stellard [Wed, 23 Oct 2013 19:35:45 +0000 (15:35 -0400)]
pipe-loader: Only export necessary symbols v3
This makes it possible to use clover with statically linked LLVM.
v2:
- Inline LINKER_SCRIPT variable
v3: Kai Wasserbäch
- Fix out-of-tree builds
Tested-by: Kai Wasserbäch <kai@dev.carbon-project.or>
Tom Stellard [Wed, 16 Oct 2013 17:43:08 +0000 (13:43 -0400)]
radeonsi/compute: Add Sea Islands support
Vincent Lejeune [Wed, 30 Oct 2013 17:35:58 +0000 (18:35 +0100)]
r600/llvm: Store inputs in function arguments
Rico Schüller [Mon, 11 Nov 2013 20:18:27 +0000 (21:18 +0100)]
tests: Fix make check for out of tree builds.
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Rico Schüller <kgbricola@web.de>
Anuj Phogat [Fri, 8 Nov 2013 00:27:25 +0000 (16:27 -0800)]
i965: Move #define's inside function as local variables
The X_f, Y_f, Xp_f and Yp_f variables are used only inside
translate_dst_to_src(), so they can be defined as local
variables.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Vinson Lee [Sat, 28 Sep 2013 05:20:04 +0000 (22:20 -0700)]
i915, i965: Fix memory leak in intel_miptree_create_for_bo.
Fixes "Resource leak" defects reported by Coverity.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Brian Paul [Fri, 8 Nov 2013 02:47:13 +0000 (19:47 -0700)]
osmesa: assorted code clean-ups
Brian Paul [Fri, 8 Nov 2013 02:01:39 +0000 (19:01 -0700)]
osmesa: fix broken triangle/line drawing when using float color buffer
Doesn't seem to help with bug 71363 but it fixed a failure I found in
my testing.
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Brian Paul [Fri, 8 Nov 2013 00:28:33 +0000 (17:28 -0700)]
svga: improve loops over color buffers
Only loop over the actual number of color buffers supported, not
PIPE_MAX_COLOR_BUFS.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
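The loop change described above can be sketched as follows. This is a minimal illustration, not the svga driver's code: struct fb_state, MAX_COLOR_BUFS and count_bound_cbufs() are hypothetical names, and max_supported stands in for whatever limit the device actually reports.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for PIPE_MAX_COLOR_BUFS. */
#define MAX_COLOR_BUFS 8

/* Hypothetical, simplified framebuffer state. */
struct fb_state {
   const void *cbufs[MAX_COLOR_BUFS];
};

/* Count bound color buffers, visiting only the slots the device
 * supports instead of all MAX_COLOR_BUFS slots. */
static unsigned
count_bound_cbufs(const struct fb_state *fb, unsigned max_supported)
{
   unsigned i, count = 0;

   assert(max_supported <= MAX_COLOR_BUFS);

   for (i = 0; i < max_supported; i++) {
      if (fb->cbufs[i] != NULL)
         count++;
   }
   return count;
}
```

Slots at or beyond max_supported are never inspected, so stale pointers there cannot affect the result.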
Brian Paul [Thu, 7 Nov 2013 23:57:23 +0000 (16:57 -0700)]
svga: document magic number of 8 render targets per batch
Grab the comments from commit message b84b7f19dfdc0 to explain
what the code is doing.
Brian Paul [Thu, 7 Nov 2013 23:59:40 +0000 (16:59 -0700)]
util: set all unused cbufs to NULL in util_copy_framebuffer_state()
This helps fix an issue in the svga driver, and is just safer all-around.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
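A rough sketch of the idea, assuming a simplified state struct: the names fb_copy_state, FB_MAX_CBUFS and copy_fb_state() are hypothetical, and the reference counting a real surface copy would need is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for PIPE_MAX_COLOR_BUFS. */
#define FB_MAX_CBUFS 8

/* Hypothetical, simplified framebuffer state (no reference counting). */
struct fb_copy_state {
   unsigned nr_cbufs;
   const void *cbufs[FB_MAX_CBUFS];
};

/* Copy the used color buffer slots, then clear every remaining slot so
 * the destination never keeps stale pointers from an earlier state. */
static void
copy_fb_state(struct fb_copy_state *dst, const struct fb_copy_state *src)
{
   unsigned i;

   assert(src->nr_cbufs <= FB_MAX_CBUFS);

   dst->nr_cbufs = src->nr_cbufs;
   for (i = 0; i < src->nr_cbufs; i++)
      dst->cbufs[i] = src->cbufs[i];
   for (; i < FB_MAX_CBUFS; i++)
      dst->cbufs[i] = NULL;
}
```

With the second loop in place, code that walks all slots of the destination sees NULL rather than leftovers from a previous, larger state.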
Brian Paul [Mon, 11 Nov 2013 15:12:05 +0000 (08:12 -0700)]
glx: declare glx_screen struct to silence warning
Brian Paul [Fri, 8 Nov 2013 16:00:46 +0000 (09:00 -0700)]
glx: change query_renderer_integer() value param to unsigned
When this function was added, the returned value was signed in some
places, unsigned in others.
v2: also add unsigned in the unit test, per Ian.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
José Fonseca [Fri, 8 Nov 2013 17:55:14 +0000 (17:55 +0000)]
glx: Fix scons build.
Reviewed-by: Brian Paul <brianp@vmware.com>
Samuel Thibault [Sun, 10 Nov 2013 18:32:01 +0000 (19:32 +0100)]
EGL: fix build without libdrm
This fixes building EGL without libdrm support.
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Chris Forbes [Sat, 9 Nov 2013 20:15:13 +0000 (09:15 +1300)]
i965: convert brw_lower_offset_array_visitor to ir_rvalue_visitor
Previously, we would bogusly replace the entire statement containing the
ir_texture node with an ir_dereference_variable.
Correct this to just replace the ir_texture node itself as intended.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Chris Forbes [Sat, 9 Nov 2013 09:26:08 +0000 (22:26 +1300)]
glsl: fix missing breaks in equals(ir_texture,..)
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
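The class of bug fixed here can be illustrated with a small sketch. The types below (enum tex_op, struct tex, tex_equals()) are hypothetical and far simpler than the real ir_texture; they only show why each case in such an equality switch needs its own break.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical texture opcodes; each one uses a different operand. */
enum tex_op { TEX_OP_PLAIN, TEX_OP_BIAS, TEX_OP_LOD };

struct tex {
   enum tex_op op;
   int bias;   /* only meaningful for TEX_OP_BIAS */
   int lod;    /* only meaningful for TEX_OP_LOD */
};

/* Compare only the operand that belongs to the opcode. */
static bool
tex_equals(const struct tex *a, const struct tex *b)
{
   assert(a != NULL && b != NULL);

   if (a->op != b->op)
      return false;

   switch (a->op) {
   case TEX_OP_PLAIN:
      break;
   case TEX_OP_BIAS:
      if (a->bias != b->bias)
         return false;
      break;   /* without this break, the lod comparison below would also run */
   case TEX_OP_LOD:
      if (a->lod != b->lod)
         return false;
      break;
   }
   return true;
}
```

If the break after the TEX_OP_BIAS case were missing, two bias lookups with equal bias but different (unused) lod fields would wrongly compare as unequal.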
Eric Anholt [Fri, 8 Nov 2013 18:49:47 +0000 (10:49 -0800)]
i965: Make the driver compile until a proper libdrm can be released.
Don't depend on unreleased code.
Armin K [Fri, 8 Nov 2013 23:06:45 +0000 (00:06 +0100)]
glx: conditionally build dri3 and present loader (v3)
This patch makes it possible to disable DRI3 if desired.
Tested with:
./configure --disable-dri3 --with-dri-drivers=i965 \
--with-gallium-drivers= --disable-vdpau --disable-egl \
--disable-gbm --disable-xvmc
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71397
Cc: 10.0 <mesa-stable@lists.freedesktop.org>
Matt Turner [Thu, 7 Nov 2013 23:09:33 +0000 (15:09 -0800)]
i965/fs: Don't perform CSE on inst HW_REG dests (unless it's null)
Commit b16b3c87 began performing CSE on CMP instructions with null
destinations. I relaxed the restrictions a bit too much, thereby
allowing CSE to be performed on instructions with, for instance, an
explicit accumulator destination.
This broke the arb_gpu_shader5/fs-imulExtended shader tests because
they emit MUL instructions with the accumulator as the destination. CSE
would instead cause the MUL to write to a GRF, which is lower precision
than the accumulator.
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: 10.0 <mesa-stable@lists.freedesktop.org>
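The restriction described above amounts to a predicate on the destination register. The sketch below is an assumption about its shape, not the i965 backend's code: enum reg_file, struct dst_reg and dst_allows_cse() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical register files: virtual GRFs vs. fixed hardware registers
 * (the accumulator and the null register would both be in the latter). */
enum reg_file { REG_FILE_GRF, REG_FILE_HW };

struct dst_reg {
   enum reg_file file;
   bool is_null;   /* true only for the null hardware register */
};

/* An instruction is a CSE candidate if its destination is not a fixed
 * hardware register, or if that hardware register is the null register. */
static bool
dst_allows_cse(const struct dst_reg *dst)
{
   assert(dst != NULL);
   return dst->file != REG_FILE_HW || dst->is_null;
}
```

Under this rule a MUL writing the accumulator is left alone, while a CMP with a null destination can still be eliminated.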
Chad Versace [Fri, 8 Nov 2013 19:35:25 +0000 (11:35 -0800)]
i965: Remove some tiny dead code from intel_miptree_map_movntdqa
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Brian Paul [Fri, 8 Nov 2013 15:33:47 +0000 (08:33 -0700)]
swrast: add missing notify_reset parameter to dri_create_context()
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Christian König [Sun, 3 Nov 2013 14:19:00 +0000 (15:19 +0100)]
vl: use a separate context for shader based decode v2
This makes VDPAU thread-safe again.
v2: fix some memory leaks reported by Aaron Watry.
Signed-off-by: Christian König <christian.koenig@amd.com>
José Fonseca [Fri, 8 Nov 2013 12:22:22 +0000 (12:22 +0000)]
scons: Add dri2_query_renderer.c to sources.
José Fonseca [Fri, 8 Nov 2013 12:20:00 +0000 (12:20 +0000)]
st/dri: Fix dri_create_context declaration prototype.
Keith Packard [Fri, 8 Nov 2013 03:01:48 +0000 (19:01 -0800)]
dri3: Fix pixmap buf_id computation
Looks like some kind of rebase damage to me...
Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Eric Anholt <eric@anholt.net>