Matt Turner [Mon, 29 Dec 2014 18:52:17 +0000 (10:52 -0800)]
mesa: Remove __SSE4_1__ guards from sse_minmax.c.
See commit
e07c9a288.
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Matt Turner [Sun, 21 Dec 2014 02:02:29 +0000 (18:02 -0800)]
i965/vec4: Do separate copy followed by constant propagation after opt_vector_float().
total instructions in shared programs:
5877012 ->
5876617 (-0.01%)
instructions in affected programs: 33140 -> 32745 (-1.19%)
From before the commit that allows VF constant propagation (which hurt
some programs) to here, the results are:
total instructions in shared programs:
5877951 ->
5876617 (-0.02%)
instructions in affected programs: 123444 -> 122110 (-1.08%)
with no programs hurt.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Matt Turner [Sun, 21 Dec 2014 01:42:52 +0000 (17:42 -0800)]
i965/vec4: Allow constant propagation of VF immediates.
total instructions in shared programs:
5877951 ->
5877012 (-0.02%)
instructions in affected programs: 155923 -> 154984 (-0.60%)
Helps 1233, hurts 156 shaders. The hurt shaders are addressed in the
next commit.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Matt Turner [Sun, 21 Dec 2014 01:37:09 +0000 (17:37 -0800)]
i965/vec4: Add parameter to skip doing constant propagation.
After CSEing some MOV ..., VF instructions we have code like
mov tmp, [1F, 2F, 3F, 4F]VF
mov r10, tmp
mov r11, tmp
...
use r10
use r11
We want to copy propagate tmp into the uses of r10 and r11, but *not*
constant propagate the VF immediate into the uses of tmp.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Matt Turner [Sat, 20 Dec 2014 21:17:00 +0000 (13:17 -0800)]
i965/vec4: Do CSE, copy propagation, and DCE after opt_vector_float().
total instructions in shared programs:
5869005 ->
5868220 (-0.01%)
instructions in affected programs: 70208 -> 69423 (-1.12%)
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Matt Turner [Thu, 3 Apr 2014 21:29:30 +0000 (14:29 -0700)]
i965/vec4: Perform CSE on MOV ..., VF instructions.
Port of commit
a28ad9d4 from the fs backend.
No shader-db changes since we don't emit MOV ..., VF instructions yet.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Matt Turner [Sat, 20 Dec 2014 19:50:31 +0000 (11:50 -0800)]
i965/vec4: Add pass to gather constants into a vector-float MOV.
Currently only handles consecutive instructions with the same
destination that collectively write all channels.
total instructions in shared programs:
5879798 ->
5869011 (-0.18%)
instructions in affected programs: 465236 -> 454449 (-2.32%)
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Matt Turner [Sun, 21 Dec 2014 14:56:54 +0000 (06:56 -0800)]
i965: Add support for saturating immediates.
I don't feel great about assert(!"unimplemented: ...") but these
cases do only seem possible under some currently impossible circumstances.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Matt Turner [Sat, 20 Dec 2014 19:47:40 +0000 (11:47 -0800)]
i965: Add fs_reg/src_reg constructors that take vf[4].
Sometimes it's easier to generate 4x values into an array, and the
memcpy is 1 instruction, rather than 11 to piece 4 arguments together.
I'd forgotten to remove the prototype from fs_reg from a previous patch,
so it's already there for us here.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Alexander von Gluck IV [Sat, 27 Dec 2014 06:12:54 +0000 (06:12 +0000)]
gallium/target: Drop no longer needed Haiku viewport override
* Drop no longer needed mesa headers
* Haiku LLVM pipe working with LLVM 3.5.0 on x86_64
Alexander von Gluck IV [Sat, 27 Dec 2014 05:55:23 +0000 (05:55 +0000)]
gallium/st: Clean up Haiku depth mapping, fix colorspace errors
Eric Anholt [Thu, 25 Dec 2014 22:22:02 +0000 (12:22 -1000)]
vc4: Handle unaligned accesses in CL emits.
As of
229bf4475ff0a5dbeb9bc95250f7a40a983c2e28 we started getting SIBGUS
from unaligned accesses on the hardware, for reasons I haven't figured
out. However, we should be avoiding unaligned accesses anyway, and our CL
setup certainly would have produced them.
Eric Anholt [Thu, 25 Dec 2014 22:14:54 +0000 (12:14 -1000)]
vc4: Don't bother zero-initializing the shader reloc indices.
They should all be set to real values by the time they're read, and
ideally if you used valgrind you'd see uninitialized value uses.
Eric Anholt [Thu, 25 Dec 2014 19:35:46 +0000 (09:35 -1000)]
vc4: Fix the argument type for cl_u16().
It doesn't matter, since it just got truncated to 16 inside, anyway.
Alexander von Gluck IV [Wed, 24 Dec 2014 13:44:25 +0000 (07:44 -0600)]
egl: Fix non-dri SCons builds re #87657
* Revert change to egl main producing Shared Libraries
* Check for dri before including dri code
Michel Dänzer [Tue, 9 Dec 2014 08:00:32 +0000 (17:00 +0900)]
radeonsi: Don't modify PA_SC_RASTER_CONFIG register value if rb_mask == 0
E.g. this could happen on older kernels which don't support the
RADEON_INFO_SI_BACKEND_ENABLED_MASK query yet. The code in
si_write_harvested_raster_configs() doesn't deal with this correctly and
would probably mangle the value badly.
Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Eric Anholt [Mon, 22 Dec 2014 18:09:10 +0000 (10:09 -0800)]
vc4: Optimize CL emits by doing size checks up front.
The optimizer obviously doesn't have the ability to rewrite these to skip
the size checks per call, so we have to do it manually.
Improves a norast benchmark on simulation by 0.779706% +/- 0.405838%
(n=6087).
Eric Anholt [Sun, 21 Dec 2014 21:10:25 +0000 (13:10 -0800)]
vc4: Avoid repeated hindex lookups in the loop over tiles.
Improves norast performance of a microbenchmark by 11.1865% +/- 2.37673%
(n=20).
Kenneth Graunke [Tue, 23 Dec 2014 02:43:08 +0000 (18:43 -0800)]
i965: Add missing BRW_NEW_*_PROG_DATA to texture/renderbuffer atoms.
This was probably missed when moving from a fixed binding table layout
to a dynamic one that changes based on the shader.
Fixes newly proposed Piglit test fbo-mrt-new-bind.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87619
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Mike Stroyan <mike@LunarG.com>
Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
Kenneth Graunke [Mon, 22 Dec 2014 08:55:37 +0000 (00:55 -0800)]
i965: Cache register write capability checks.
Our ability to perform register writes depends on the hardware and
kernel version. It shouldn't ever change on a per-context basis,
so we only need to check once.
Checking introduces a synchronization point between the CPU and GPU:
even though we submit very few GPU commands, the GPU might be busy doing
other work, which could cause us to stall for a while.
On an idle i7 4750HQ, this improves performance in OglDrvCtx (a context
creation microbenchmark) by 6.14748% +/- 1.6837% (n=20). With Unigine
Valley running in the background (to keep the GPU busy), it improves
performance in OglDrvCtx by 2290.92% +/- 29.5274% (n=5).
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Rob Clark [Sat, 25 Oct 2014 19:47:21 +0000 (15:47 -0400)]
freedreno/ir3: split out legalize pass
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Sat, 25 Oct 2014 18:04:29 +0000 (14:04 -0400)]
freedreno/ir3: ra debug
Some compile time RA debug
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Alexander von Gluck IV [Mon, 22 Dec 2014 16:29:56 +0000 (11:29 -0500)]
egl/haiku: Clean up SConscript whitespace
Alexander von Gluck IV [Mon, 22 Dec 2014 16:27:35 +0000 (11:27 -0500)]
egl/dri2: Fix build of dri2 egl driver with SCons
* egl/dri2 was missing a SConscript
* Problem caught by Adrián Arroyo Calle
Alexander von Gluck IV [Mon, 22 Dec 2014 16:02:50 +0000 (16:02 +0000)]
egl: Clean up Haiku visual creation
* Only create one struct
* 'final' also is a language conflict
* Some style cleanup
Alexander von Gluck IV [Mon, 22 Dec 2014 15:10:13 +0000 (10:10 -0500)]
egl: Add Haiku code and support
* This is the cleaned up work of the Haiku GCI student
Adrián Arroyo Calle adrian.arroyocalle@gmail.com
* Several patches were consolidated to prevent
unnecessary touching of non-related code
Timothy Arceri [Tue, 25 Nov 2014 12:04:23 +0000 (23:04 +1100)]
glsl: check if implicitly sized arrays match explicitly sized arrays across the same stage
V2: Improve error message.
Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Chad Versace [Wed, 19 Nov 2014 05:11:26 +0000 (21:11 -0800)]
i965: Use safer pointer arithmetic in gather_oa_results()
This patch reduces the likelihood of pointer arithmetic overflow bugs in
gather_oa_results(), like the one fixed by
b69c7c5dac.
I haven't yet encountered any overflow bugs in the wild along this
patch's codepath. But I get nervous when I see code patterns like this:
(void*) + (int) * (int)
I smell 32-bit overflow all over this code.
This patch retypes 'snapshot_size' to 'ptrdiff_t', which should fix any
potential overflow.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Chad Versace [Wed, 19 Nov 2014 05:11:25 +0000 (21:11 -0800)]
i965: Use safer pointer arithmetic in intel_texsubimage_tiled_memcpy()
This patch reduces the likelihood of pointer arithmetic overflow bugs in
intel_texsubimage_tiled_memcpy() , like the one fixed by
b69c7c5dac.
I haven't yet encountered any overflow bugs in the wild along this
patch's codepath. But I recently solved, in commit
b69c7c5dac, an overflow
bug in a line of code that looks very similar to pointer arithmetic in
this function.
This patch conceptually applies the same fix as in
b69c7c5dac. Instead
of retyping the variables, though, this patch adds some casts. (I tried
to retype the variables as ptrdiff_t, but it quickly got very messy. The
casts are cleaner).
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Chad Versace [Wed, 19 Nov 2014 05:11:24 +0000 (21:11 -0800)]
i965: Fix intel_miptree_map() signature to be more 64-bit safe
This patch should diminish the likelihood of pointer arithmetic overflow
bugs, like the one fixed by
b69c7c5dac.
Change the type of parameter 'out_stride' from int to ptrdiff_t. The
logic is that if you call intel_miptree_map() and use the value of
'out_stride', then you must be doing pointer arithmetic on 'out_ptr'.
Using ptrdiff_t instead of int should make a little bit harder to hit
overflow bugs.
As a side-effect, some function-scope variables needed to be retyped to
avoid compilation errors.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Chad Versace [Wed, 19 Nov 2014 05:11:23 +0000 (21:11 -0800)]
i965: Remove spurious casts in copy_image_with_memcpy()
If a pointer points to raw, untyped memory and is never dereferenced,
then declare it as 'void*' instead of casting it to 'void*'.
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Marek Olšák [Wed, 10 Dec 2014 20:08:50 +0000 (21:08 +0100)]
radeonsi: force NaNs to 0
This fixes incorrect rendering in Unreal Engine demos.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83510
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
David Heidelberg [Fri, 19 Dec 2014 13:13:15 +0000 (14:13 +0100)]
st/nine: fix DBG typo (trivial)
Signed-off-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
David Heidelberg [Fri, 19 Dec 2014 13:11:21 +0000 (14:11 +0100)]
r300g: implement ARR opcode
Same as ARL, just has extra rounding.
Useful for st/nine.
Tested-by: Pavel Ondračka <pavel.ondracka@email.cz>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Rob Clark [Sat, 20 Dec 2014 17:04:05 +0000 (12:04 -0500)]
freedreno/a4xx: blend-color
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Sat, 20 Dec 2014 17:01:02 +0000 (12:01 -0500)]
freedreno/a4xx: alpha-test
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Sat, 20 Dec 2014 16:49:34 +0000 (11:49 -0500)]
freedreno: update generated headers
Rob Clark [Sat, 20 Dec 2014 16:46:43 +0000 (11:46 -0500)]
freedreno/ir3: trans_kill cleanup
trans_kill() only handles the single opcode. Drop the remnant of a time
when both KILL and KILL_IF were handled by the same fxn.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Sat, 20 Dec 2014 16:44:28 +0000 (11:44 -0500)]
freedreno/ir3: hack for standalone compiler
Standalone compiler doesn't have screen or context. We need to come up
with a better way to control the target arch (ie. something that we can
control from cmdline w/ standalone compiler) but for now this hack keeps
it from segfault'ing.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Matt Turner [Fri, 19 Dec 2014 20:55:13 +0000 (12:55 -0800)]
i965/fs: Add missing const qualifier.
Eric Anholt [Thu, 18 Dec 2014 04:35:17 +0000 (20:35 -0800)]
vc4: Coalesce MOVs into VPM with the instructions generating the values.
total instructions in shared programs: 41168 -> 40976 (-0.47%)
instructions in affected programs: 18156 -> 17964 (-1.06%)
Eric Anholt [Thu, 18 Dec 2014 04:23:57 +0000 (20:23 -0800)]
vc4: Redefine VPM writes as a (destination) QIR register file.
This will let me coalesce the VPM writes into the instructions generating
the values.
Timothy Arceri [Wed, 17 Dec 2014 20:46:24 +0000 (07:46 +1100)]
docs: note change in minimum GCC version to 4.2.0
Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
Timothy Arceri [Wed, 17 Dec 2014 20:45:04 +0000 (07:45 +1100)]
gallium: remove support for GCC older than 4.2.0
Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Timothy Arceri [Wed, 17 Dec 2014 20:29:47 +0000 (07:29 +1100)]
mesa: bump required GCC version to 4.2.0
It turns out Mesa hasn't compiled on less then 4.2 for a while
so update conf to reflect this.
Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Eric Anholt [Wed, 10 Dec 2014 22:56:46 +0000 (14:56 -0800)]
vc4: Add support for turning constant uniforms into small immediates.
Small immediates have the downside of taking over the raddr B field, so
you might have less chance to pack instructions together thanks to raddr B
conflicts. However, it also reduces some register pressure since it lets
you load 2 "uniform" values in one instruction (avoiding a previous load
of the constant value to a register), and increases some pairing for the
same reason.
total uniforms in shared programs: 16231 -> 13374 (-17.60%)
uniforms in affected programs: 10280 -> 7423 (-27.79%)
total instructions in shared programs: 40795 -> 41168 (0.91%)
instructions in affected programs: 25551 -> 25924 (1.46%)
In a previous version of this patch I had a reduction in instruction count
by forcing the other args alongside a SMALL_IMM to be in the A file or
accumulators, but that increases register pressure and had a bug in
handling FRAG_Z. In this patch is I just use raddr conflict resolution,
which is more expensive. I think I'd rather tweak allocation to have some
way to slightly prefer good choices for files in general, rather than risk
failing to register allocate by forcing things into register classes.
Eric Anholt [Wed, 10 Dec 2014 23:37:07 +0000 (15:37 -0800)]
vc4: Move follow_movs() to common QIR code.
I want this from other passes.
Eric Anholt [Thu, 11 Dec 2014 00:20:36 +0000 (16:20 -0800)]
vc4: Fix missing newline for load immediate instruction disasm.
Matt Turner [Wed, 17 Dec 2014 21:19:37 +0000 (13:19 -0800)]
mesa: Remove unnecessary -f from $(RM).
$(RM) includes -f.
Matt Turner [Mon, 8 Dec 2014 02:09:49 +0000 (18:09 -0800)]
mesa: Remove tarballs/checksum rules.
Matt Turner [Wed, 17 Dec 2014 21:37:38 +0000 (13:37 -0800)]
gallium: Add egl and gbm to distribution.
Matt Turner [Wed, 17 Dec 2014 20:41:02 +0000 (12:41 -0800)]
mesa: Set DISTCHECK_CONFIGURE_FLAGS.
Enable some non-default options that distros are likely to use.
Matt Turner [Wed, 17 Dec 2014 20:40:43 +0000 (12:40 -0800)]
targets/xvmc: Add uninstall hooks to handle megadriver hardlinks.
Matt Turner [Wed, 17 Dec 2014 20:40:30 +0000 (12:40 -0800)]
targets/vdpau: Add uninstall hooks to handle megadriver hardlinks.
Matt Turner [Wed, 17 Dec 2014 20:39:59 +0000 (12:39 -0800)]
targets/vdpau: Add clean-local rule to remove .lib links.
Eric Anholt [Sat, 13 Dec 2014 23:27:39 +0000 (15:27 -0800)]
vc4: Add a userspace BO cache.
Since our kernel BOs require CMA allocation, and the use of them requires
new mmaps, it's pretty expensive and we should avoid it if possible.
Copying my original design for Intel, make a userspace cache that reuses
BOs that haven't been shared to other processes but frees BOs that have
sat in the cache for over a second.
Improves glxgears framerate on RPi by around 30%.
Eric Anholt [Tue, 14 Oct 2014 10:21:04 +0000 (11:21 +0100)]
vc4: Add dmabuf support.
This gets DRI3 working on modesetting with glamor. It's not enabled under
simulation, because it looks like handing our dumb-allocated buffers off
to the server doesn't actually work for the server's rendering.
Eric Anholt [Wed, 17 Dec 2014 00:11:27 +0000 (16:11 -0800)]
vc4: Drop a weird argument in the BOs-from-handles API.
Roland Scheidegger [Wed, 17 Dec 2014 19:16:07 +0000 (20:16 +0100)]
draw: revert using correct order for prim decomposition.
This reverts
db3dfcfe90a3d27e6020e0d3642f8ab0330e57be.
The commit was correct but we've got some precision problems later in
llvmpipe (or possibly in draw clip) due to the vertices coming in in
different order, causing some internal test failures. So revert for now.
(Will only affect drivers which actually support constant-interpolated
attributes and not just flatshading.)
Jan Vesely [Thu, 11 Dec 2014 20:05:17 +0000 (15:05 -0500)]
util: Silence signed-unsigned comparison warnings
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Cody Northrop [Mon, 15 Sep 2014 22:14:20 +0000 (16:14 -0600)]
i965: Require pixel alignment for GPU copy blit
The blitter will start at a pixel's natural alignment. For PBOs, if the
provided offset if not aligned, bits will get dropped.
This change adds offset alignment check for src and dst, kicking back if
the requirements are not met.
The change is based on following verbiage from BSPEC:
Color pixel sizes supported are 8, 16, and 32 bits per pixel (bpp).
All pixels are naturally aligned.
Found in the following locations:
page 35 of intel-gfx-prm-osrc-hsw-blitter.pdf
page 29 of ivb_ihd_os_vol1_part4.pdf
page 29 of snb_ihd_os_vol1_part5.pdf
This behavior was observed with Steam Big Picture rendering incorrect
icon colors. The fix has been tested on Ubuntu and SteamOS on Haswell.
Signed-off-by: Cody Northrop <cody@lunarg.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83908
Reviewed-by: Neil Roberts <neil@linux.intel.com>
Mark Janes [Tue, 16 Dec 2014 22:29:28 +0000 (14:29 -0800)]
i965: remove includes of sampler.h from extern "C" blocks
C linkage was removed from functions in program/sampler.cpp. However,
some cpp files include program/sampler.h within extern "C" blocks,
causing link errors for test_vec4_copy_propagation.
Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
Kenneth Graunke [Thu, 11 Dec 2014 10:26:39 +0000 (02:26 -0800)]
i965/query: Cache whether the batch references the query BO.
Chris Wilson noted that repeated calls to CheckQuery() would call
drm_intel_bo_references(brw->batch.bo, query->bo) on each invocation,
which is expensive. Once we've flushed, we know that future batches
won't reference query->bo, so there's no point in asking more than once.
This patch adds a brw_query_object::flushed flag, which is a
conservative estimate of whether the batch has been flushed.
On the first call to CheckQuery() or WaitQuery(), we check if the
batch references query->bo. If not, it must have been flushed for
some reason (such as being full). We record that it was flushed.
If it does reference query->bo, we explicitly flush, and record that
we did so.
Any subsequent checks will simply see that query->flushed is set,
and skip the drm_intel_bo_references() call.
Inspired by a patch from Chris Wilson.
According to Eero, this does not affect the performance of Witcher 2
on Haswell, but approximately halves the userspace CPU usage.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86969
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Kenneth Graunke [Thu, 11 Dec 2014 09:57:39 +0000 (01:57 -0800)]
i965/query: Use brw_bo_map to handle stall warnings.
This is less code and also measures the duration of the stall for us.
Our old code predates the existance of brw_bo_map().
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Kenneth Graunke [Thu, 11 Dec 2014 09:43:52 +0000 (01:43 -0800)]
i965/query: Remove redundant drm_intel_bo_references call in CheckQuery.
CheckQuery calls drm_intel_bo_references to see if the batch references
the query BO, and if so, flushes. It then checks if the query BO is
busy, and if not, calls gen6_queryobj_get_results().
Stupidly, gen6_queryobj_get_results() immediately did a second redundant
drm_intel_bo_references check, even though we know the buffer is not
referenced and in fact idle.
This patch moves the batch-flush check out of gen6_queryobj_get_results
and into WaitQuery() (the other caller). That way, both callers do a
single batch-flush check.
This should only be a minor improvement, since it would only affect
the first CheckQuery call where the result is actually available.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86969
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Kenneth Graunke [Thu, 11 Dec 2014 09:40:28 +0000 (01:40 -0800)]
i965/query: Add query->bo == NULL early return in CheckQuery hook.
If query->bo == NULL, this is a redundant CheckQuery call, and we
should simply return. We didn't do anything anyway - we skipped the
batch flushing block, and although we called get_results(), it has an
early return and does nothing. Why bother?
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Kenneth Graunke [Thu, 11 Dec 2014 09:29:48 +0000 (01:29 -0800)]
i965/query: Set Ready flag in gen6_queryobj_get_results().
q->Ready means that the results are in, and core Mesa is free to return
them to the application. gen6_queryobj_get_results() is a natural place
to set that flag; doing so means callers don't have to.
The older non-hardware-context aware code couldn't do this, because we
had to call brw_queryobj_get_results() to gather intermediate results
when we ran out of space for snapshots in the query buffer. We only
gather complete results in the Gen6+ code, however.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Eric Anholt [Tue, 16 Dec 2014 19:58:58 +0000 (11:58 -0800)]
vc4: Add support for turning add-based MOVs to muls for pairing.
total instructions in shared programs: 43053 -> 40795 (-5.24%)
instructions in affected programs: 37996 -> 35738 (-5.94%)
Eric Anholt [Tue, 16 Dec 2014 19:22:53 +0000 (11:22 -0800)]
vc4: Add a helper for changing a field in an instruction.
Eric Anholt [Tue, 16 Dec 2014 19:29:15 +0000 (11:29 -0800)]
vc4: Fix the name of qpu_waddr_ignores_ws().
We're deciding about the WS bit, not PM.
Timothy Arceri [Sat, 13 Dec 2014 10:22:39 +0000 (21:22 +1100)]
docs: note change in minimum GCC version to 4.1.0
Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Timothy Arceri [Fri, 12 Dec 2014 09:47:43 +0000 (20:47 +1100)]
util: remove support for GCC older than 4.1.0
Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-By: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Timothy Arceri [Fri, 12 Dec 2014 09:44:26 +0000 (20:44 +1100)]
mesa: remove support for GCC older than 4.1.0
Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-By: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Timothy Arceri [Fri, 12 Dec 2014 09:38:22 +0000 (20:38 +1100)]
gbm: remove support for GCC older than 4.1.0
Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-By: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Timothy Arceri [Fri, 12 Dec 2014 09:36:25 +0000 (20:36 +1100)]
gallium: remove support for GCC older than 4.1.0
Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-By: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Timothy Arceri [Fri, 12 Dec 2014 09:27:44 +0000 (20:27 +1100)]
egl: remove support for GCC older than 4.1.0
Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-By: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Timothy Arceri [Fri, 12 Dec 2014 08:12:48 +0000 (19:12 +1100)]
mesa: bump required GCC version to 4.1.0
Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-By: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Timothy Arceri [Fri, 12 Dec 2014 08:05:22 +0000 (19:05 +1100)]
mesa: remove support for GCC older than 3.3.0
GCC >=3.3 has been required since
9aa3aa71386394725ce88df463d6183f62777ee5
Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-By: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Matt Turner [Tue, 16 Dec 2014 19:30:12 +0000 (11:30 -0800)]
i965/fs: Add a comment explaining what saturate propagation does.
Eric Anholt [Mon, 15 Dec 2014 23:11:07 +0000 (15:11 -0800)]
vc4: Add support for enabling early Z discards.
This is the same basic logic from the original Broadcom driver.
Brian Paul [Tue, 16 Dec 2014 00:05:53 +0000 (17:05 -0700)]
st/mesa: remove extern "C" around #includes in st_glsl_to_tgsi.cpp
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Brian Paul [Tue, 16 Dec 2014 00:05:13 +0000 (17:05 -0700)]
program: remove extern "C" usage in sampler.cpp
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Brian Paul [Tue, 16 Dec 2014 00:04:38 +0000 (17:04 -0700)]
program: remove extern "C" around #includes
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Brian Paul [Mon, 15 Dec 2014 23:41:58 +0000 (16:41 -0700)]
glsl: remove extern "C" around #includes
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Brian Paul [Mon, 15 Dec 2014 23:49:54 +0000 (16:49 -0700)]
st/mesa: add extern "C" to st_context.h
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Brian Paul [Mon, 15 Dec 2014 23:46:46 +0000 (16:46 -0700)]
st/mesa: add extern "C" to st_program.h
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Brian Paul [Mon, 15 Dec 2014 23:45:57 +0000 (16:45 -0700)]
main: remove extern C around #includes in ff_fragment_shader.cpp
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Brian Paul [Mon, 15 Dec 2014 23:41:29 +0000 (16:41 -0700)]
mesa: move #include of mtypes.h outside __cplusplus check
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Brian Paul [Tue, 16 Dec 2014 00:01:22 +0000 (17:01 -0700)]
program: add #ifndef SAMPLER_H wrapper
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Brian Paul [Mon, 15 Dec 2014 23:37:35 +0000 (16:37 -0700)]
mesa: put extern "C" in src/mesa/program/*h header files
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Brian Paul [Mon, 15 Dec 2014 23:36:27 +0000 (16:36 -0700)]
mesa: put extern "C" in header files
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Juha-Pekka Heikkila [Tue, 16 Dec 2014 10:28:49 +0000 (12:28 +0200)]
mapi: add glapi-test and shared-glapi-test to .gitignore
On the same go remove src/mapi/shared-glapi/tests/.gitignore
and src/mapi/glapi/tests/.gitignore as useless.
Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Juha-Pekka Heikkila [Tue, 16 Dec 2014 10:28:48 +0000 (12:28 +0200)]
util: add u_atomic_test to .gitignore
Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Juha-Pekka Heikkila [Tue, 16 Dec 2014 10:28:47 +0000 (12:28 +0200)]
glx: remove __glXstrdup()
I didn't find this being used anywhere
Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Juha-Pekka Heikkila [Tue, 16 Dec 2014 10:28:46 +0000 (12:28 +0200)]
i965: add test_vf_float_conversions to .gitignore
Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Juha-Pekka Heikkila [Tue, 16 Dec 2014 10:28:45 +0000 (12:28 +0200)]
i965: Make validate_reg tables constant
Declare local tables constant.
Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Timothy Arceri [Mon, 15 Dec 2014 07:15:34 +0000 (18:15 +1100)]
glsl: remove commented out code
MaxGeometryOutputComponents is used as the value
for gl_MaxGeometryVaryingComponents
Acked-by: Matt Turner <mattst88@gmail.com>
Timothy Arceri [Mon, 15 Dec 2014 07:10:12 +0000 (18:10 +1100)]
i965: remove commented out code
Acked-by: Matt Turner <mattst88@gmail.com>
Ilia Mirkin [Tue, 16 Dec 2014 04:17:28 +0000 (23:17 -0500)]
nvc0: add missed PIPE_CAP_VERTEXID_NOBASE
Commit
ade8b26bf missed adding this cap to nvc0.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Roland Scheidegger [Fri, 12 Dec 2014 03:14:02 +0000 (04:14 +0100)]
st/mesa: use vertex id lowering according to pipe cap bit.
Tested with llvmpipe by setting the cap bit temporarily, seems to work,
though no driver requests it for now.