Christoph Bumiller [Mon, 25 Feb 2013 11:52:43 +0000 (12:52 +0100)]
nv50/ir: initialize CodeEmitters' specialized target fields
Christoph Bumiller [Wed, 20 Feb 2013 20:03:30 +0000 (21:03 +0100)]
nv50/ir/opt: make optimization aware of atomics, barriers, surface ops
Christoph Bumiller [Fri, 22 Feb 2013 17:45:16 +0000 (18:45 +0100)]
nv50/ir: add various new OPs that will be needed for compute
Francisco Jerez [Fri, 18 May 2012 14:17:44 +0000 (16:17 +0200)]
nv50/ir: Rename "mkLoad" to "mkLoadv" for consistency.
Christoph Bumiller [Sun, 24 Feb 2013 17:36:21 +0000 (18:36 +0100)]
nv50/ir: fix comparison of system values
Francisco Jerez [Tue, 6 Mar 2012 19:18:12 +0000 (20:18 +0100)]
nv50/ir/tgsi: Translate grid-related system parameters.
Francisco Jerez [Mon, 14 Nov 2011 23:12:20 +0000 (00:12 +0100)]
nv50/ir/tgsi: Accept COMPUTE programs.
Christoph Bumiller [Wed, 27 Feb 2013 20:08:57 +0000 (21:08 +0100)]
nv50/ir/ra: make sure all used function inputs get assigned a reg
A live range [0, 0) counts as empty. For function inputs this can
be a problem, so insert a nop at the beginning to make it [0, 1).
This is a bit of a hack but also the most simple solution.
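To illustrate why a [0, 0) range is a problem, here is a minimal sketch of a half-open live range. It is not the nv50/ir code: the LiveRange class and input_live_range helper are hypothetical, and the model assumes an input is defined at serial 0 and that its range ends at the serial of its last use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LiveRange:
    """Half-open interval [begin, end) over instruction serial numbers."""
    begin: int
    end: int

    def is_empty(self) -> bool:
        # A half-open interval contains no points when begin >= end.
        return self.begin >= self.end


def input_live_range(last_use: int, leading_nop: bool) -> LiveRange:
    """Live range of a function input whose last use is at serial last_use.

    Hypothetical model: the input is defined at serial 0, and a leading
    nop shifts every real instruction one slot later.
    """
    shift = 1 if leading_nop else 0
    return LiveRange(0, last_use + shift)
```

Under these assumptions an input consumed by the very first instruction gets [0, 0) and looks unused, while the same input with a leading nop gets [0, 1) and is treated as live.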
Christoph Bumiller [Mon, 25 Feb 2013 13:45:52 +0000 (14:45 +0100)]
nv50/ir/ra: also add pre-existing MERGE,SPLIT to constraint list
Christoph Bumiller [Wed, 6 Feb 2013 16:14:55 +0000 (17:14 +0100)]
nv50/ir/ra: fix confusion with conditional RegisterSet::occupy
Christoph Bumiller [Thu, 28 Feb 2013 22:41:41 +0000 (23:41 +0100)]
nv50/ir/ra: swap copyCompound args if src is compound and dst isn't
Francisco Jerez [Mon, 30 Apr 2012 13:22:27 +0000 (15:22 +0200)]
nv50/ir/ra: Fix maxGPR calculation for programs with multiple functions.
Francisco Jerez [Mon, 30 Apr 2012 13:19:40 +0000 (15:19 +0200)]
nv50/ir/ra: Fix traversal before the beginning of the active list in buildRIG.
Francisco Jerez [Mon, 30 Apr 2012 13:13:07 +0000 (15:13 +0200)]
nv50/ir/ra: Fix RegisterSet::occupy(const Value *v).
Francisco Jerez [Mon, 30 Apr 2012 13:12:15 +0000 (15:12 +0200)]
nv50/ir/ra: Fix argument const-ness in RegisterSet::idToUnits and idToBytes
Francisco Jerez [Wed, 6 Feb 2013 13:12:44 +0000 (14:12 +0100)]
nv50/ir/opt: Fix tryPropagateBranch for BBs with several exit branches.
Comments and "if (bf->cfg.incidentCount() == 1)" condition added
by Christoph Bumiller.
Francisco Jerez [Mon, 30 Apr 2012 13:06:52 +0000 (15:06 +0200)]
nv50/ir: Clean up references to function values before destroying them.
Francisco Jerez [Wed, 25 Apr 2012 21:48:47 +0000 (23:48 +0200)]
nouveau: Bail out from nouveau_fence_wait if flushing the pushbuf fails.
Vinson Lee [Mon, 11 Mar 2013 05:51:23 +0000 (22:51 -0700)]
mesa: Use correct functions for enum conversion.
Fixes mixing enum types defects reported by Coverity.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Rob Clark [Sat, 27 Oct 2012 16:07:34 +0000 (11:07 -0500)]
freedreno: gallium driver for adreno
Currently works on a220. Others in the a2xx family look pretty similar
and should be pretty straightforward to support with the same driver.
The a3xx has a new shader ISA, and while many registers appear similar,
the register addresses have been completely shuffled around. I am not
sure yet whether it is best to support with the same driver, but
different compiler, or whether it should be split into a different
driver.
v1: original
v2: build file updates from review comments, and remove GPL licensed
header files from msm kernel
v3: smarter temp/pred register assignment, fix clear and depth/stencil
format issues, resource_transfer fixes, scissor fixes
Signed-off-by: Rob Clark <robdclark@gmail.com>
José Fonseca [Mon, 11 Mar 2013 10:13:47 +0000 (10:13 +0000)]
d3d1x: Remove.
Unused/unmaintained.
Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>
José Fonseca [Mon, 11 Mar 2013 10:14:19 +0000 (10:14 +0000)]
nv50: Remove nv0_ir_from_sm4.*
Unused, depends on d3d1x.
Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>
Roland Scheidegger [Sat, 9 Mar 2013 00:46:33 +0000 (01:46 +0100)]
gallivm: clean up passing derivatives around
Previously, the derivatives were calculated and passed in a packed form
to the sample code (for implicit derivatives, explicit derivatives were
packed to the same format).
There are several reasons why this wasn't such a good idea:
1) the derivatives may not even be needed (not as bad as it sounds since
llvm will just throw the calculations needed for them away but still)
2) the special packing format really shouldn't be part of the sampler
interface
3) depending on what the sample code actually does the derivatives will
be processed differently, hence there is no "ideal" packing. For cube
maps with explicit derivatives (which we don't do yet) for instance the
packing looked downright useless, and for non-isotropic filtering we'd
need different calculations too.
So, instead just pass the derivatives as is (for explicit derivatives),
or let the rho calculating sample code calculate them itself. This still
does exactly the same packing stuff for implicit derivatives for now,
though explicit ones are handled in a more straightforward manner (quick
estimates show performance should be quite similar, though it is much
easier to follow and also does the rho calculation per-pixel until the
end, which we eventually need for spec compliance anyway).
No piglit changes.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Chad Versace [Thu, 21 Feb 2013 03:59:07 +0000 (19:59 -0800)]
i965: Fix typo in doxygen hyperlink
s/brw_state_upload/brw_upload_state/
Found because the link was broken.
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Eric Anholt [Wed, 20 Feb 2013 01:01:41 +0000 (17:01 -0800)]
mesa: Reduce memory usage for reg alloc with many graph nodes (part 2).
After the previous fix that almost removes an allocation of 4*n^2
bytes, we can use a bitset to reduce another allocation from n^2 bytes
to n^2/8 bytes.
Between the previous commit and this one, the peak heap size for an
oglconform ARB_fragment_program max instructions test on i965 goes from
4GB to 255MB.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55825
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
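The n^2 bytes to n^2/8 bytes saving comes from packing eight adjacency flags into each byte. A minimal sketch of that idea follows; the BitMatrix class and add_interference helper are hypothetical illustrations, not the register allocator's actual data structures.

```python
class BitMatrix:
    """n x n boolean matrix packed eight entries per byte."""

    def __init__(self, n: int) -> None:
        self.n = n
        # Round up so the last partial byte is still allocated.
        self.bits = bytearray((n * n + 7) // 8)

    def set(self, i: int, j: int) -> None:
        pos = i * self.n + j
        self.bits[pos >> 3] |= 1 << (pos & 7)

    def test(self, i: int, j: int) -> bool:
        pos = i * self.n + j
        return bool(self.bits[pos >> 3] & (1 << (pos & 7)))


def add_interference(m: BitMatrix, a: int, b: int) -> None:
    # Interference is symmetric, so record both directions.
    m.set(a, b)
    m.set(b, a)
```

For 1000 nodes this sketch allocates 125000 bytes, where one byte per entry would need 1000000.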
Eric Anholt [Wed, 20 Feb 2013 00:46:41 +0000 (16:46 -0800)]
mesa: Reduce the memory usage for reg alloc with many graph nodes (part 1)
We were allocating an adjacency_list entry for every possible
interference that could get created, but that usually doesn't happen.
We can save a lot of memory by resizing the array on demand.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Wed, 20 Feb 2013 00:20:10 +0000 (16:20 -0800)]
i965/fs: Improve CSE performance by expiring some available expressions.
We're already walking the list, and we can easily know when something
has no reason to be in the list any longer, so take a brief extra step
to reduce our worst-case runtime (an oglconform test that emits the
maximum instructions in a fragment program). I don't actually know what
the worst-case runtime was, because it was too long and I got bored.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Tue, 19 Feb 2013 22:36:06 +0000 (14:36 -0800)]
i965/fs: Improve live variables calculation performance.
We can execute way fewer instructions by doing our boolean manipulation
on an "int" of bits at a time, while also reducing our working set size.
Reduces compile time of L4D2's slowest shader from 4s to 1.1s
(-72.4% +/- 0.2%, n=10)
v2: Remove redundant masking (noted by Ken)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
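The speedup from working on an "int" of bits at a time can be sketched with the standard backward liveness equations, where one bitwise operation updates every variable in a word at once. This is an illustrative model only: compute_liveness and its argument layout are assumptions, not the i965 implementation.

```python
def compute_liveness(blocks, successors):
    """Iterate the backward dataflow equations to a fixed point.

    blocks:     list of (use, define) bitmask pairs, one per basic block,
                where bit v is set if variable v is read before being
                written (use) or written (define) in that block.
    successors: list of successor block indices per block.
    Returns (livein, liveout) as lists of bitmasks.
    """
    n = len(blocks)
    livein = [0] * n
    liveout = [0] * n
    changed = True
    while changed:
        changed = False
        for b in reversed(range(n)):
            use, define = blocks[b]
            out = 0
            for s in successors[b]:
                out |= livein[s]  # one OR covers every variable at once
            new_in = use | (out & ~define)
            if out != liveout[b] or new_in != livein[b]:
                liveout[b], livein[b] = out, new_in
                changed = True
    return livein, liveout
```

Each OR or AND here replaces what would otherwise be a per-variable loop over booleans, which is where both the instruction count and the working set shrink.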
Eric Anholt [Thu, 7 Mar 2013 01:50:50 +0000 (17:50 -0800)]
i965/fs: Also do the gen4 SEND dependency workaround against other SENDs.
We were handling the dependency workaround for the first written reg
of a send preceding the one we're fixing up, but didn't consider the other
regs. Thus if you had two sampler calls that got allocated to the same
set of regs, one might, rarely, overwrite the other. This was occurring in
XBMC's GLSL shaders.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44567
NOTE: This is a candidate for the stable branches.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Wed, 6 Mar 2013 22:47:22 +0000 (14:47 -0800)]
i965/fs: Switch to using sampler LD messages for uniform pull constants.
When forcing the compiler to always generate pull constants instead of
push constants (in order to have an easy to use testcase), improves
performance of my old GLSL demo 23.3553% +/- 1.42968% (n=7).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60866
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Wed, 6 Mar 2013 23:58:46 +0000 (15:58 -0800)]
i965/fs: Fix broken rendering in large shaders with UBO loads.
The lowering process creates a new vgrf on gen7 that should be represented
in live interval analysis. As-is, it was getting a conflicting allocation
with gl_FragDepth in the dolphin emulator, producing broken rendering.
NOTE: This is a candidate for the 9.1 branch.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61317
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Thu, 7 Mar 2013 01:12:28 +0000 (17:12 -0800)]
i965/fs: Add a comment about an implementation detail.
I was going to fix the code above like the previous commit, but we already
had that covered (otherwise all our uniform access would have been broken,
unlike just pull constants).
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Thu, 7 Mar 2013 00:38:10 +0000 (16:38 -0800)]
i965/fs: Fix register allocation for uniform pull constants in 16-wide.
We were allowing a compressed instruction to write a register that
contained the last use of a uniform pull constant (either UBO load or push
constant spillover), so it would get half its values smashed.
Since we need to see the actual instruction to decide this, move the
pre-gen6 pixel_x/y logic here, which should improve the performance of
register allocation since virtual_grf_interferes() is called more than
once per instruction.
NOTE: This is a candidate for the stable branches.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61317
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Wed, 6 Mar 2013 00:24:07 +0000 (16:24 -0800)]
intel: Remove some unused debug flags.
I was looking at the list to see what might be interesting to document for
application developers, and it turns out some are completely dead.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Zack Rusin [Fri, 8 Mar 2013 03:15:03 +0000 (19:15 -0800)]
draw/gs: Correctly iterate the emitted primitives
We were assuming that each emitted primitive had the same
number of vertices. That is incorrect. Emitted primitives
can have an arbitrary number of vertices. Simply increment
index on iteration to fix it.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
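A small sketch of iterating primitives whose vertex counts differ: the position in the flat vertex list advances by each primitive's own length instead of a fixed stride. The split_primitives helper and its arguments are hypothetical, not the draw module's interface.

```python
def split_primitives(vertices, primitive_lengths):
    """Slice a flat vertex list into primitives of differing sizes."""
    prims = []
    start = 0
    for length in primitive_lengths:
        prims.append(vertices[start:start + length])
        # Advance by this primitive's own length, not a fixed stride.
        start += length
    return prims
```

With lengths [3, 5], a fixed stride of 3 would start the second primitive at the right place but cut it short; the running offset handles any mix of lengths.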
Zack Rusin [Fri, 8 Mar 2013 03:11:28 +0000 (19:11 -0800)]
tgsi/exec: Correctly reset NumOutputs before parsing the shader
Whenever we're binding the shaders we're incrementing NumOutputs,
assuming the parser spots an output declaration, but we were never
resetting the variable. That means that each subsequent bind of
a geometry shader would add its number of outputs to the number
of outputs bound by all previously run shaders and our indexes
would get completely messed up.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
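The accumulation described above can be reproduced with a toy counter. ShaderMachine and its bind method are hypothetical stand-ins for the interpreter state, with a reset flag added purely so both behaviours can be compared.

```python
class ShaderMachine:
    """Hypothetical stand-in for an interpreter that counts declared outputs."""

    def __init__(self) -> None:
        self.num_outputs = 0

    def bind(self, declarations, reset: bool = True) -> int:
        if reset:
            # Start from zero so earlier binds cannot leak into this one.
            self.num_outputs = 0
        for decl in declarations:
            if decl == "OUT":
                self.num_outputs += 1
        return self.num_outputs
```

Binding a two-output shader and then a one-output shader without the reset reports 3 outputs for the second shader; with the reset it reports 1.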
Roland Scheidegger [Mon, 11 Mar 2013 16:03:55 +0000 (17:03 +0100)]
draw/llvm: another quick hack for drawing with no position output
Also need to skip things if we have no cv value but pos value
(happens with geometry shaders enabled).
Needs a round of cleanup, though.
Roland Scheidegger [Fri, 8 Mar 2013 21:29:34 +0000 (22:29 +0100)]
softpipe: don't use samplers with prebaked sampler and sampler_view state
This is needed for handling the dx10-style sample opcodes.
This also simplifies the logic by getting rid of sampler variants
completely (sampler_views though OTOH have sort of variants because
some of their state is different depending on the shader stage they
are bound to).
No significant performance difference (openarena run:
840 frames in 459.8 seconds vs. 840 frames in 460.5 seconds).
v2: fix reference counting bug spotted by Jose.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Roland Scheidegger [Fri, 8 Mar 2013 21:10:21 +0000 (22:10 +0100)]
tgsi: emit code for SVIEWINFO and SAMPLE_I
Can handle them since the single sampler interface was introduced.
v2: simplify txf/sample_i handling a bit according to Brian's feedback.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Roland Scheidegger [Fri, 8 Mar 2013 18:45:52 +0000 (19:45 +0100)]
tgsi: fix wrong reg used for unit for TGSI_OPCODE_TXF
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Tom Stellard [Mon, 11 Mar 2013 15:10:51 +0000 (11:10 -0400)]
r600g/llvm: Fix build
Marek Olšák [Tue, 5 Mar 2013 00:15:45 +0000 (01:15 +0100)]
r600g: add debug options disabling various copy-buffer-related features
This will be invaluable for debugging and bug reports.
Marek Olšák [Mon, 4 Mar 2013 12:26:51 +0000 (13:26 +0100)]
mesa: don't allocate a texture if width or height is 0 in CopyTexImage
NOTE: This is a candidate for the stable branches.
Reviewed-by: Brian Paul <brianp@vmware.com>
Marek Olšák [Sun, 3 Mar 2013 16:33:11 +0000 (17:33 +0100)]
gallium/util: attempt to fix blitting multisample texture arrays
We don't have a test for this yet, but obviously the swizzle was wrong.
Marek Olšák [Sun, 3 Mar 2013 13:54:31 +0000 (14:54 +0100)]
r600g: allocate FMASK right after the texture, so that it's aligned with it
This avoids the kernel CS checker errors with MSAA textures.
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Marek Olšák [Sun, 3 Mar 2013 13:33:00 +0000 (14:33 +0100)]
r600g: remove r600.h, move the stuff elsewhere (mostly to r600_pipe.h)
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Marek Olšák [Sun, 3 Mar 2013 13:21:34 +0000 (14:21 +0100)]
r600g: remove r600_hw_context_priv.h, move the stuff to r600_pipe.h
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Marek Olšák [Sat, 2 Mar 2013 16:36:05 +0000 (17:36 +0100)]
r600g: remove deprecated state management code
It's nice to see so much code that did pretty much nothing go away.
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Marek Olšák [Sat, 2 Mar 2013 16:14:51 +0000 (17:14 +0100)]
r600g: atomize pixel shader
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Marek Olšák [Thu, 28 Feb 2013 16:27:36 +0000 (17:27 +0100)]
r600g: atomize vertex shader
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Marek Olšák [Fri, 1 Mar 2013 17:42:52 +0000 (18:42 +0100)]
r600g: inline r600_pipe_shader function
also change names of other functions, so that they make sense
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Marek Olšák [Fri, 1 Mar 2013 15:58:03 +0000 (16:58 +0100)]
r600g: dump vertex elements state along with the fetch shader
Marek Olšák [Fri, 1 Mar 2013 15:57:27 +0000 (16:57 +0100)]
gallium/util: dump instance_divisor
Marek Olšák [Fri, 1 Mar 2013 16:13:18 +0000 (17:13 +0100)]
r600g: remove bytecode dumping
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Marek Olšák [Fri, 1 Mar 2013 15:31:49 +0000 (16:31 +0100)]
r600g: use a single env var R600_DEBUG, disable bytecode dumping
Only the disassembler is used to dump shaders. Here are a few examples
of how to use R600_DEBUG.
Log compute info:
R600_DEBUG=compute
Dump all shaders:
R600_DEBUG=fs,vs,gs,ps,cs
Dump pixel shaders only:
R600_DEBUG=ps
Disable Hyper-Z:
R600_DEBUG=nohyperz
Disable the LLVM backend:
R600_DEBUG=nollvm
Or use any combination of the above, or print all options:
R600_DEBUG=help
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
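A comma-separated option variable like this is typically split and matched against a table of known names. The sketch below shows that general pattern only; parse_debug_flags is hypothetical, and KNOWN lists just the option names that appear in the examples above, so the real driver may accept others and may parse the value differently.

```python
def parse_debug_flags(value, known):
    """Split a comma-separated option string into recognised and unknown names."""
    enabled, unknown = set(), []
    for name in value.split(","):
        name = name.strip()
        if not name:
            continue  # tolerate empty items such as "vs,,ps"
        if name in known:
            enabled.add(name)
        else:
            unknown.append(name)
    return enabled, unknown


# Option names taken only from the examples in the commit message;
# this is an assumption about the set, not the driver's full table.
KNOWN = {"compute", "fs", "vs", "gs", "ps", "cs", "nohyperz", "nollvm", "help"}
```

A caller would pass the environment value, for example os.environ.get("R600_DEBUG", ""), and decide separately how to report unknown names.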
Marek Olšák [Fri, 1 Mar 2013 14:32:46 +0000 (15:32 +0100)]
r600g: cleanup #include recursion between r600_pipe.h and evergreen_compute.h
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Marek Olšák [Fri, 1 Mar 2013 14:30:39 +0000 (15:30 +0100)]
r600g: don't check for R600_ENABLE_S3TC env var
Stefan Brüns [Sat, 9 Mar 2013 20:55:50 +0000 (21:55 +0100)]
glapi/gen: Remove duplicate PYTHON_FLAGS
PYTHON_GEN calls python with PYTHON_FLAGS
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Stefan Brüns <stefan.bruens@rwth-aachen.de>
Frank Henigman [Fri, 1 Mar 2013 02:21:51 +0000 (21:21 -0500)]
i965: Link i965_dri.so with C++ linker.
Force C++ linking of i965_dri.so by adding a dummy C++ source file.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Maxence Le Doré [Thu, 7 Mar 2013 01:30:03 +0000 (02:30 +0100)]
gallium/util: Correct shift value for TSC feature detection.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Matt Turner [Tue, 5 Mar 2013 18:25:55 +0000 (10:25 -0800)]
configure.ac: Build dricommon for DRI gallium drivers
Commit 67ef7559 added an || test "x$enable_dri" check in an attempt to
get the DRI common bits built in some necessary cases. That change was
inappropriate as it made these common DRI pieces be built
unconditionally, so some builds were broken.
Subsequently, commit 998d975e3 changed the "|| test" to a "-a"
conjunction within the existing test invocation. This made the '-a
"x$enable_dri" = xyes' clause have no effect (as it was inside an
enclosing test for the same condition). So the new breakage from
commit 67ef7559 was addressed, but the original problems were
regressed.
The immediately preceding commit removed the redundant condition.
Now, finally this commit fixes the original problem as described in
the commit message of 67ef7559: this code should be compiled when
using the DRI state tracker. In order to do so, the HAVE_*_DRI
conditionals must be moved after the last assignment of HAVE_COMMON_DRI.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61821
Tested-by: Stéphane Marchesin <marcheu@chromium.org>
Matt Turner [Tue, 5 Mar 2013 18:27:22 +0000 (10:27 -0800)]
configure.ac: Remove redundant checks of enable_dri.
The whole block is enclosed inside if test "x$enable_dri" = xyes.
Matt Turner [Mon, 4 Mar 2013 19:32:32 +0000 (11:32 -0800)]
mesa: Allow ETC2/EAC formats with ARB_ES3_compatibility.
Fixes piglit's oes_compressed_etc2_texture-miptree tests on Desktop GL.
Reported-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Stéphane Marchesin [Fri, 8 Mar 2013 21:32:55 +0000 (13:32 -0800)]
i915g: Use PIPE_FLUSH_END_OF_FRAME to trigger throttling
This helps with jittering, instead of throttling at every command
buffer we only throttle once a frame.
Stéphane Marchesin [Sat, 9 Mar 2013 00:16:33 +0000 (16:16 -0800)]
i915g: Update TODO
Brian Paul [Fri, 8 Mar 2013 17:32:39 +0000 (10:32 -0700)]
docs: document another Viewperf bug
Jan de Groot [Thu, 7 Mar 2013 18:48:13 +0000 (19:48 +0100)]
dri/nouveau: fix crash in nouveau_flush
https://bugs.freedesktop.org/show_bug.cgi?id=61947
Note: this is a candidate for the stable branches
Brian Paul [Thu, 7 Mar 2013 15:10:56 +0000 (08:10 -0700)]
draw: add const qualifier to silence compiler warning
Brian Paul [Wed, 6 Mar 2013 23:57:20 +0000 (16:57 -0700)]
llvmpipe: remove the power of two sizeof(struct cmd_block) assertion
It fails on 32-bit systems (I only tested on 64-bit). Power of two
size isn't required, so just remove the assertion.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Brian Paul [Wed, 6 Mar 2013 19:08:17 +0000 (12:08 -0700)]
vbo: fix crash found with shared display lists
This fixes a crash when a display list is created in one context
but executed from a second one. The vbo_save_context::vertex_store
member will be NULL if we never created a display list with the
context. Just check for that before dereferencing the pointer.
Fixes http://bugzilla.redhat.com/show_bug.cgi?id=918661
Note: This is a candidate for the stable branches.
Alan Hourihane [Wed, 6 Mar 2013 18:14:01 +0000 (18:14 +0000)]
mesa: fix glGetInteger*(GL_SAMPLER_BINDING).
If the sampler object has been deleted on another context, an
alternative context may reference the old sampler. So ensure the sampler
object still exists.
Note: this is a candidate for the stable branch.
Signed-off-by: Alan Hourihane <alanh@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Christian König [Thu, 7 Mar 2013 09:06:24 +0000 (10:06 +0100)]
radeon/llvm: document LLVM commit
We need at least that revision to work correctly now.
Signed-off-by: Christian König <christian.koenig@amd.com>
Christian König [Wed, 27 Feb 2013 21:40:24 +0000 (22:40 +0100)]
radeon/llvm: enable LICM and DCE pass v2
LICM stands for Loop Invariant Code Motion. Instructions that
do not depend on the loop index are moved outside of the loop body.
DCE is DeadCodeElimination.
v2: updated commit msg, thx to Vincent.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Vincent Lejeune <vljn at ovi.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
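As a hand-written illustration of what loop-invariant code motion does, the two functions below compute the same result; the second is what the first looks like after the invariant multiplication has been hoisted. The function names are made up for this sketch, and an LLVM pass performs the transformation on IR rather than on source like this.

```python
def scale_naive(values, a, b):
    out = []
    for v in values:
        factor = a * b  # does not depend on the loop variable
        out.append(v * factor)
    return out


def scale_hoisted(values, a, b):
    factor = a * b  # computed once, before the loop
    return [v * factor for v in values]
```

The hoisted form does one multiplication of a and b regardless of how many values are processed.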
Christian König [Wed, 27 Feb 2013 21:39:26 +0000 (22:39 +0100)]
radeonsi: add LLVMNoUnwindAttribute to intrinsic
So LLVM can better eliminate dead code.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Christian König [Tue, 5 Mar 2013 14:07:39 +0000 (15:07 +0100)]
radeonsi: rework input interpolation
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Christian König [Tue, 5 Mar 2013 11:14:02 +0000 (12:14 +0100)]
radeonsi: remove SI.vs.load.buffer.index
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Christian König [Mon, 4 Mar 2013 15:30:06 +0000 (16:30 +0100)]
radeon/llvm: make SGPRs proper function arguments v2
v2: remove unrelated changes
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Christian König [Mon, 4 Mar 2013 14:35:30 +0000 (15:35 +0100)]
radeon/llvm: replace shader type intrinsic with function attribute
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Christian König [Fri, 1 Mar 2013 10:34:16 +0000 (11:34 +0100)]
radeonsi: switch to v*i8 for resources and samplers v2
v2: remove unrelated changes
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Christian König [Thu, 7 Mar 2013 09:02:24 +0000 (10:02 +0100)]
r600g/llvm: Update CONSTANT_BUFFER address space definition
To match recent LLVM changes.
Signed-off-by: Christian König <christian.koenig@amd.com>
Zack Rusin [Wed, 27 Feb 2013 09:28:18 +0000 (01:28 -0800)]
draw/llvm: fix inputs to the geometry shader
We can't clip and viewport transform the vertices before we let
the geometry shader process them. Let's make sure the generated
vertex shader has both disabled if a geometry shader is present.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Bryan Cain [Fri, 15 Feb 2013 16:09:12 +0000 (10:09 -0600)]
draw: use geometry shader info in clip_init_state if appropriate
Reviewed-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Bryan Cain [Fri, 15 Feb 2013 16:05:36 +0000 (10:05 -0600)]
draw: account for separate shader objects in geometry shader code
The geometry shader code seems to have been originally written with the
assumptions that there are the same number of VS outputs as GS outputs and
that VS outputs are in the same order as their corresponding GS inputs. Since
TGSI uses separate shader objects, these are both wrong assumptions. This
was causing several valid vertex/geometry shader combinations to either render
incorrectly or trigger an assertion.
Conflicts:
src/gallium/auxiliary/draw/draw_gs.c
Reviewed-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
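One way to avoid both assumptions is to match each GS input to a VS output by semantic rather than by position. The sketch below shows that idea under a simplifying assumption that each declaration can be reduced to a (semantic name, semantic index) pair; map_gs_inputs is hypothetical and is not claimed to be how draw_gs.c does it.

```python
def map_gs_inputs(vs_outputs, gs_inputs):
    """For each GS input, find the slot of the VS output with the same semantic.

    Both arguments are lists of (semantic_name, semantic_index) tuples,
    a hypothetical simplification of shader declaration info.
    Returns a list of VS output slots, with None where nothing matches.
    """
    slot_of = {sem: slot for slot, sem in enumerate(vs_outputs)}
    return [slot_of.get(sem) for sem in gs_inputs]
```

With three VS outputs and two GS inputs declared in a different order, the lookup still finds the right slots, which a position-based copy would not.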
Alan Hourihane [Wed, 6 Mar 2013 16:08:58 +0000 (16:08 +0000)]
Unreference sampler object when it's currently bound to texture unit.
This change specifically unbinds a sampler object from the texture unit
if it's bound to a unit. The spec calls for reverting to the default object
when deleting sampler objects which are currently bound.
Note: this is a candidate for the stable branches
Signed-off-by: Alan Hourihane <alanh@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Brian Paul [Wed, 6 Mar 2013 01:08:50 +0000 (18:08 -0700)]
llvmpipe: fix incorrect 'j' array index in dummy texture code
Use 0 instead.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Brian Paul [Mon, 4 Mar 2013 21:44:47 +0000 (14:44 -0700)]
llvmpipe: remove unused cmd_block_list struct
Brian Paul [Mon, 4 Mar 2013 21:38:20 +0000 (14:38 -0700)]
llvmpipe: add some scene limit sanity check assertions
Note: This is a candidate for the stable branches.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Brian Paul [Mon, 4 Mar 2013 21:33:04 +0000 (14:33 -0700)]
llvmpipe: tweak CMD_BLOCK_MAX and LP_SCENE_MAX_SIZE
We advertise a max texture/surfaces size of 8K x 8K but the old values
for these limits didn't actually allow us to handle that surface size.
For 8K x 8K we'll have 16384 bins. Each bin needs at least one cmd_block
object which was 2192 bytes in size. Since 16384 * 2192 exceeded
LP_SCENE_MAX_SIZE we'd silently fail in lp_scene_new_data_block() and not
draw the complete scene.
By reducing CMD_BLOCK_MAX to 29 we get nice 512-byte cmd_blocks. And
by increasing LP_SCENE_MAX_SIZE to 9 MB we can allocate enough command
blocks for 8K x 8K, plus a few regular data blocks.
Fixes the (improved) piglit fbo-maxsize test.
Note: This is a candidate for the stable branches.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
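The figures quoted above can be checked with a few lines of arithmetic. The bin count and block sizes come from the message; reading "9 MB" as 9 * 2**20 bytes and the 64 x 64 pixel bin size are assumptions made for this sketch (the latter is inferred from 16384 bins covering 8192 x 8192 pixels).

```python
BINS = 16384                      # bins for an 8K x 8K surface, per the message
OLD_BLOCK_BYTES = 2192            # old cmd_block size, per the message
NEW_BLOCK_BYTES = 512             # size after reducing CMD_BLOCK_MAX to 29
NEW_SCENE_MAX = 9 * 1024 * 1024   # assumes "9 MB" means 9 * 2**20 bytes

# Assumed 64 x 64 pixel bins reproduce the quoted bin count.
assert (8192 // 64) ** 2 == BINS

old_total = BINS * OLD_BLOCK_BYTES   # one old-size cmd_block per bin
new_total = BINS * NEW_BLOCK_BYTES   # one 512-byte cmd_block per bin
headroom = NEW_SCENE_MAX - new_total # left over for regular data blocks
```

One old-size block per bin needs 35913728 bytes, far more than even the new limit, while 512-byte blocks need 8388608 bytes and leave 1048576 bytes of the assumed 9 MB for other data blocks.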
Kenneth Graunke [Mon, 4 Mar 2013 19:38:28 +0000 (11:38 -0800)]
i965: Don't fill buffer with zeroes.
This was only necessary because our bounds checking was off by one, and
thus we read an extra pair of values.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Kenneth Graunke [Mon, 4 Mar 2013 19:37:35 +0000 (11:37 -0800)]
i965: Fix off-by-one in query object result gathering.
If we've written N pairs of values to the buffer, then last_index = N,
but the values are 0 .. N-1. Thus, we need to use <, not <=.
This worked anyway because we fill the buffer with zeroes, so we just
added an extra (0 - 0) to our results.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
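The off-by-one and the reason it was harmless can be shown with a toy version of the gathering loop. gather_result and its inclusive_bug flag are hypothetical; the buffer is modelled as a flat list of (start, end) pairs.

```python
def gather_result(buffer, last_index, inclusive_bug=False):
    """Sum (end - start) over the pairs written to buffer.

    buffer holds flattened (start, end) pairs; last_index is the number
    of pairs written, so valid pair indices are 0 .. last_index - 1.
    """
    upper = last_index + 1 if inclusive_bug else last_index
    total = 0
    for i in range(upper):
        start, end = buffer[2 * i], buffer[2 * i + 1]
        total += end - start
    return total
```

With a zero-filled tail the inclusive loop adds (0 - 0) and returns the same total, but if the extra slot holds stale data the result is wrong, which is why the zero fill could only be dropped once the comparison was corrected.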
Christian König [Wed, 6 Mar 2013 11:08:54 +0000 (12:08 +0100)]
radeon/llvm: fix trivial warnings
Signed-off-by: Christian König <christian.koenig@amd.com>
Christian König [Wed, 6 Mar 2013 10:49:53 +0000 (11:49 +0100)]
radeonsi: fix trivial warning
Signed-off-by: Christian König <christian.koenig@amd.com>
Eric Anholt [Tue, 29 Jan 2013 00:59:29 +0000 (11:59 +1100)]
intel: Improve the matching (more formats!) for TexImage from PBOs.
Mesa core is the place for encoding what format/type matches a mesa
format, so rely on that.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Eric Anholt [Mon, 28 Jan 2013 06:44:17 +0000 (17:44 +1100)]
intel: Improve the test for readpixels blit path format checking.
We were allowing things like copying RG1616 to a user's ARGB8888
format, while we were denying anything that wasn't ARGB8888 or
RGB565.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Eric Anholt [Mon, 28 Jan 2013 00:32:49 +0000 (11:32 +1100)]
intel: Fold intel_region_copy() into its one caller.
This is similar code to intel_miptree_copy_slice, but the knobs
are all set differently.
v2: fix whitespace
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Eric Anholt [Sun, 27 Jan 2013 22:14:42 +0000 (09:14 +1100)]
intel: Transition intel_region_map() to being a miptree operation.
I'm trying to move us away from the region structure, and all the
callers are currently dereferencing a miptree to get the region.
In this change, the map_refcount is dropped. However, the bo->virtual is
itself map refcounted, so that's already dealt with.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Eric Anholt [Sun, 27 Jan 2013 21:57:15 +0000 (08:57 +1100)]
intel: Remove num_mapped_regions tracking.
The point of tracking the value was removed in February 2012
(65b096aeddd9b45ca038f44cc9adfff86c8c48b2), and this should have
been removed at the same time.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Eric Anholt [Sun, 27 Jan 2013 19:37:43 +0000 (05:37 +1000)]
intel: Remove the struct intel_region reuse hash table.
I don't see any reason for it -- it was introduced with the DRI2
invalidate work by krh in 2010 with no explanation. I suspect it was
something about wanting the same drm_intel_bo struct underneath multiple
openings of the BO within one process, but that's covered by libdrm at
this point. As far as the struct region goes, it is not threadsafe, so
multiple contexts sharing a region could have mixed up the map_count and
hit an assertion failure or worse.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
José Fonseca [Tue, 5 Mar 2013 22:46:38 +0000 (22:46 +0000)]
scons: Provide shorthand aliases for software winsyses.
José Fonseca [Tue, 5 Mar 2013 22:46:01 +0000 (22:46 +0000)]
scons: Fix llvm-config not found error message.
"% llvm_version" is bogus copy'n'paste cruft.