Christoph Bumiller [Sun, 8 Apr 2012 21:38:55 +0000 (23:38 +0200)]
nv50/ir: fix Instruction::isCommutationLegal for WAW
Francisco Jerez [Sun, 8 Apr 2012 21:14:15 +0000 (23:14 +0200)]
nv50/ir/opt: Add isOptSupported() check in logical arith optimization.
Francisco Jerez [Tue, 27 Dec 2011 11:43:27 +0000 (12:43 +0100)]
nv50/ir/ra: Fix live set propagation in the secondary passes of buildLiveSets().
Christoph Bumiller [Tue, 7 Feb 2012 21:39:20 +0000 (22:39 +0100)]
nv50/ir/opt: don't regard OP_WRSV as dead code
Christoph Bumiller [Sat, 14 Apr 2012 19:30:52 +0000 (21:30 +0200)]
nv50/ir: add isUniform query to Values
Christoph Bumiller [Mon, 9 Apr 2012 18:58:39 +0000 (20:58 +0200)]
nv50/ir: rewrite the register allocator as GCRA, with spilling
This is more flexible than the linear scan, and we don't need the
separate allocation pass for constrained values anymore.
Christoph Bumiller [Thu, 5 Apr 2012 21:14:33 +0000 (23:14 +0200)]
nv50/ir/tgsi: only export x-component of PSIZE
Christoph Bumiller [Thu, 5 Apr 2012 20:53:46 +0000 (22:53 +0200)]
nvc0: fix emission of 3rd src in SET_AND,OR,XOR
Francisco Jerez [Mon, 9 Apr 2012 18:48:43 +0000 (20:48 +0200)]
nv50/ir: Fix BuildUtil::mkSelect and mkClobber
Christoph Bumiller [Fri, 6 Apr 2012 17:18:05 +0000 (19:18 +0200)]
nv50/ir: fix reg file conflicts with undefined-value placeholders
Christoph Bumiller [Mon, 2 Apr 2012 18:55:03 +0000 (20:55 +0200)]
nv50/ir/opt: silence warning (int < Elements() signedness)
Christoph Bumiller [Mon, 2 Apr 2012 18:53:46 +0000 (20:53 +0200)]
nv50/ir/opt: fix combineSt access to wrong instruction
Christoph Bumiller [Sun, 29 Jan 2012 14:41:52 +0000 (15:41 +0100)]
nv50/ir/opt: another insn NULL check in phi elimination
Francisco Jerez [Sun, 27 Nov 2011 12:06:10 +0000 (13:06 +0100)]
nv50/ir/ssa: Take into account function inputs and outputs.
Francisco Jerez [Tue, 27 Mar 2012 19:48:58 +0000 (21:48 +0200)]
nv50/ir: Clean up before calculating instruction ordering for a new function.
Francisco Jerez [Tue, 15 Nov 2011 16:24:18 +0000 (17:24 +0100)]
nv50/ir/ra: Allocate registers for function arguments.
Francisco Jerez [Fri, 6 Apr 2012 17:16:04 +0000 (19:16 +0200)]
nv50/ir: Take into account function args in the live range calculation code.
Francisco Jerez [Thu, 29 Mar 2012 21:23:53 +0000 (23:23 +0200)]
nv50/ir/ra: Use matching physical regs for function args in caller and callee.
Francisco Jerez [Fri, 6 Apr 2012 17:08:27 +0000 (19:08 +0200)]
nv50/ir/tgsi: Infer function inputs/outputs.
Edit: Don't do it for the main function of (graphics) shaders,
its inputs and outputs always go through TGSI_FILE_INPUT/OUTPUT.
This prevents all TEMPs from counting as live out and reduces
register pressure.
Francisco Jerez [Tue, 27 Mar 2012 15:29:55 +0000 (17:29 +0200)]
nv50/ir/tgsi: Replace the inlining logic with proper function calls.
Francisco Jerez [Tue, 27 Mar 2012 15:30:31 +0000 (17:30 +0200)]
nv50/ir: Decouple DataArray from the dictionary that maps locations to values.
The point is to keep an independent dictionary for each function.
The array that was being used as a dictionary has been converted into
a "bimap" for two reasons: first, because keeping an almost empty
array with as many entries as there are registers in the program,
once for every function, would be wasteful, and second, because we
want to be able to map Value pointers back to locations at some
point.
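A minimal sketch of the kind of sparse two-way map described above. This is not the actual nv50/ir code: `Value`, `Location` and `ValueBimap` are hypothetical stand-ins, and `std::map` is an assumed choice of container.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical stand-in for the IR's Value class.
struct Value { int id; };

// Hypothetical location key: a register file plus an index into it.
struct Location {
    int file;
    std::uint32_t index;
    bool operator<(const Location &o) const {
        return file != o.file ? file < o.file : index < o.index;
    }
};

// Sparse two-way map: only locations that are actually used take space,
// and a Value pointer can be mapped back to its location.
class ValueBimap {
public:
    void insert(const Location &loc, Value *val) {
        // Drop stale entries first so both directions stay consistent.
        auto f = fwd.find(loc);
        if (f != fwd.end())
            rev.erase(f->second);
        auto r = rev.find(val);
        if (r != rev.end())
            fwd.erase(r->second);
        fwd[loc] = val;
        rev[val] = loc;
    }

    // Location -> Value, or nullptr if the location is unused.
    Value *lookup(const Location &loc) const {
        auto it = fwd.find(loc);
        return it == fwd.end() ? nullptr : it->second;
    }

    // Value -> Location; returns false if the value is not mapped.
    bool locate(Value *val, Location &out) const {
        auto it = rev.find(val);
        if (it == rev.end())
            return false;
        out = it->second;
        return true;
    }

    std::size_t size() const { return fwd.size(); }

private:
    std::map<Location, Value *> fwd;
    std::map<Value *, Location> rev;
};
```

With one such object per function, an unused register costs nothing, whereas a dense array would reserve a slot for every register in every function.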
Christoph Bumiller [Thu, 22 Mar 2012 10:59:32 +0000 (11:59 +0100)]
nv50/ir/opt: don't delete instruction in removeFlow before its last use
Christoph Bumiller [Thu, 22 Mar 2012 10:58:31 +0000 (11:58 +0100)]
nv50/ir/opt: check BB equality before instruction ordering in CSE
Christoph Bumiller [Thu, 22 Mar 2012 10:51:52 +0000 (11:51 +0100)]
nv50/ir/opt: don't copy-propagate cond MOVs or MOVs to other reg files
We've never encountered the latter on nvc0, but on nv50 we have moves
between GPRs and address regs.
Christoph Bumiller [Tue, 7 Feb 2012 19:45:03 +0000 (20:45 +0100)]
nv50/ir/opt: don't replace conditional definitions in CSE
Francisco Jerez [Thu, 17 Nov 2011 17:23:28 +0000 (18:23 +0100)]
nv50/ir/opt: Update the symbol size when combining loads and stores.
Christoph Bumiller [Wed, 21 Dec 2011 16:06:27 +0000 (17:06 +0100)]
nv50/ir: initialize FlowInstruction::builtin
Francisco Jerez [Wed, 21 Mar 2012 22:53:01 +0000 (23:53 +0100)]
nv50/ir/opt: Fix for function calls.
Francisco Jerez [Fri, 6 Apr 2012 16:50:56 +0000 (18:50 +0200)]
nv50/ir: Build a "symbol" table with the binary offsets of each function.
Francisco Jerez [Mon, 14 Nov 2011 23:18:28 +0000 (00:18 +0100)]
nv50/ir: Add support for removing functions from a program.
Francisco Jerez [Mon, 9 Apr 2012 19:18:31 +0000 (21:18 +0200)]
nv50/ir: Scan program functions in DFS-postorder.
The reason is that several passes (regalloc, function argument
binding, inlining) are going to require the callees of a function to
be processed before the caller.
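The ordering argument above can be sketched as a DFS-postorder walk over a call graph. This is only an illustration: `CallGraph`, `visit` and `postorder` are hypothetical names, and functions are represented by plain integer indices rather than the IR's own classes.

```cpp
#include <vector>

// Hypothetical call graph: calls[f] lists the functions called by f.
using CallGraph = std::vector<std::vector<int>>;

static void visit(const CallGraph &calls, int f,
                  std::vector<bool> &seen, std::vector<int> &order)
{
    if (seen[f])
        return;
    seen[f] = true; // marked before descending, so a cycle cannot loop forever
    for (int callee : calls[f])
        visit(calls, callee, seen, order);
    order.push_back(f); // emitted only after all of its callees
}

// Returns the functions reachable from root, callees before callers.
std::vector<int> postorder(const CallGraph &calls, int root)
{
    std::vector<bool> seen(calls.size(), false);
    std::vector<int> order;
    visit(calls, root, seen, order);
    return order;
}
```

For a program where function 0 calls 1 and 2, and 1 calls 2, the walk from 0 yields 2, 1, 0, so a pass that iterates in this order sees every callee before its caller.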
Francisco Jerez [Fri, 6 Apr 2012 16:43:29 +0000 (18:43 +0200)]
nv50/ir: Deal with graph iterators using RAII.
Francisco Jerez [Tue, 15 Nov 2011 01:07:21 +0000 (02:07 +0100)]
nv50/ir: Add convenience method for calculating the live sets of a function.
Francisco Jerez [Wed, 21 Mar 2012 20:43:26 +0000 (21:43 +0100)]
nv50/ir: Add support code for calculating the clobber set of a BB or function.
Francisco Jerez [Mon, 9 Apr 2012 18:43:28 +0000 (20:43 +0200)]
nv50/ir/opt: Don't lose modifiers during constant folding.
Francisco Jerez [Tue, 20 Mar 2012 23:39:00 +0000 (00:39 +0100)]
nv50/ir/opt: Improve modifier handling.
Francisco Jerez [Sat, 14 Apr 2012 19:25:22 +0000 (21:25 +0200)]
nv50/ir: Add support for cloning FlowInsns, ImmediateVals and BBs.
Francisco Jerez [Sat, 14 Apr 2012 19:24:16 +0000 (21:24 +0200)]
nv50/ir: Decouple object cloning logic from the sub-object recursion policy.
Francisco Jerez [Sat, 14 Apr 2012 19:23:03 +0000 (21:23 +0200)]
nv50/ir: Make sure that several IR objects are destroyed on takedown.
Christoph Bumiller [Mon, 9 Apr 2012 18:40:35 +0000 (20:40 +0200)]
nv50/ir: make Instruction::src/def container private
Francisco Jerez [Thu, 29 Mar 2012 19:18:24 +0000 (21:18 +0200)]
nv50/ir: Add support for unlimited instruction arguments.
Christoph Bumiller [Thu, 29 Mar 2012 19:32:41 +0000 (21:32 +0200)]
nv50/ir: temporarily exclude nv50 code emitter from build
It's not used yet and shouldn't have been included in the first
place.
Christoph Bumiller [Fri, 6 Apr 2012 16:37:24 +0000 (18:37 +0200)]
nv50/ir: copy value size in SSA-rename pass
Christoph Bumiller [Mon, 9 Apr 2012 18:34:24 +0000 (20:34 +0200)]
nv50/ir/opt: improve post-multiply and check target for support
Christoph Bumiller [Wed, 28 Mar 2012 21:50:32 +0000 (23:50 +0200)]
nv50/ir: add setFlagsDef/Src helper
Will be used by nv50 target.
Christoph Bumiller [Fri, 6 Apr 2012 16:34:44 +0000 (18:34 +0200)]
nv50/ir: add isAccessSupported check for memory access coalescing
Christoph Bumiller [Wed, 28 Mar 2012 19:30:59 +0000 (21:30 +0200)]
nv50/ir: add function for splitting a BasicBlock
Fixes to initial implementation by Francisco Jerez.
Francisco Jerez [Tue, 15 Nov 2011 20:39:52 +0000 (21:39 +0100)]
nv50/ir: Allow attaching two nodes when either one is already inside the graph.
Francisco Jerez [Tue, 15 Nov 2011 20:39:22 +0000 (21:39 +0100)]
nv50/ir: Allow inserting isolated nodes to a graph.
Francisco Jerez [Mon, 14 Nov 2011 23:38:15 +0000 (00:38 +0100)]
nv50/ir: Fix memory corruption in Function::orderInstructions().
"iter" doesn't reference a BasicBlock directly, but a Node::Graph,
i.e. BasicBlock::get() is casting to the wrong pointer type.
Francisco Jerez [Tue, 15 Nov 2011 14:58:04 +0000 (15:58 +0100)]
nv50/ir: Fix up insertion of PHI instructions using bb->insertHead().
Christoph Bumiller [Tue, 15 Nov 2011 23:39:41 +0000 (00:39 +0100)]
nv50/ir: fix insertHead and remove for BBs with PHI ops only
Francisco Jerez [Sat, 19 Nov 2011 20:31:28 +0000 (21:31 +0100)]
nv50/ir: Don't crash on zero sized BitSets.
Francisco Jerez [Tue, 15 Nov 2011 00:50:58 +0000 (01:50 +0100)]
nv50/ir: Fix Interval::clear().
Christoph Bumiller [Sun, 25 Dec 2011 17:34:35 +0000 (18:34 +0100)]
nv50/ir/tgsi: handle inferSrcType(NOT) to be u32
Francisco Jerez [Mon, 14 Nov 2011 22:09:45 +0000 (23:09 +0100)]
nv50/ir/opt: Fix OP_NOT to modifier conversion.
Dave Airlie [Sat, 14 Apr 2012 19:25:59 +0000 (20:25 +0100)]
r600g: disable dual-src, it hangs evergreen for some reason.
This did work previously, so I've broken something.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Tom Stellard [Sat, 14 Apr 2012 16:11:29 +0000 (12:11 -0400)]
r300/compiler: Exit immediately from rc_vert_fc() if there is an error
This way we correctly report "Too many temporaries" errors.
https://bugs.freedesktop.org/show_bug.cgi?id=48680
Note: This is a candidate for the stable branches.
Tom Stellard [Sat, 14 Apr 2012 14:02:19 +0000 (10:02 -0400)]
r300/compiler: Copy all instruction attributes during local transforms
Instruction attributes like WriteALUResult and ALUResultCompare
were being discarded during some of the local transformations.
This fixes the following piglit tests:
glsl1-inequality (vec2, pass)
loopfunc
fs-any-bvec2-using-if
fs-op-ne-bvec2-bvec2-using-if
fs-op-ne-ivec2-ivec2-using-if
fs-op-ne-mat2-mat2-using-if
fs-op-ne-vec2-vec2-using-if
fs-op-ne-mat2x3-mat2x3-using-if
fs-op-ne-mat2x4-mat2x4-using-if
https://bugs.freedesktop.org/show_bug.cgi?id=45921
NOTE: This is a candidate for the stable branches.
Tom Stellard [Wed, 21 Sep 2011 04:05:55 +0000 (21:05 -0700)]
r300/compiler: Fix nested flow control in r500 vertex shaders
Tom Stellard [Fri, 13 Apr 2012 02:07:40 +0000 (22:07 -0400)]
r300/compiler: Clear loop registers in vertex shaders w/o loops
The loop registers weren't being cleared, so any shader that was
executed after a shader containing loops was at risk of having a loop
randomly inserted into it.
This fixes over one hundred piglit tests, although these tests
only failed during full piglit runs and would pass if
run individually. The exact number of piglit tests that this patch
fixes will vary depending on the version of piglit and the order the
tests are run.
NOTE: This is a candidate for the stable branches.
Eric Anholt [Fri, 16 Mar 2012 22:44:25 +0000 (15:44 -0700)]
glsl: If an "if" has no "then" or "else" code left, remove it.
Cuts 8/1068 instructions from glyphy's fragment shaders on i965.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
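A rough sketch of the idea behind this optimization, on a deliberately simplified statement tree rather than the real GLSL IR. `Stmt`, `Block` and `remove_empty_ifs` are hypothetical names, and the sketch assumes that an "if" condition has no side effects, so dropping the whole statement is safe.

```cpp
#include <memory>
#include <vector>

struct Stmt;
using Block = std::vector<std::unique_ptr<Stmt>>;

// Hypothetical statement node: either an "if" with two bodies, or
// anything else (assignment, call, ...), which this pass leaves alone.
struct Stmt {
    bool is_if = false;
    Block then_body;
    Block else_body;
};

// Removes every "if" whose then and else bodies are both empty.
// Nested ifs are handled first, so an "if" that only contained empty
// ifs becomes empty itself and is removed in the same pass.
// Returns true if anything was removed.
bool remove_empty_ifs(Block &block)
{
    bool progress = false;
    for (auto it = block.begin(); it != block.end();) {
        Stmt &s = **it;
        if (s.is_if) {
            progress |= remove_empty_ifs(s.then_body);
            progress |= remove_empty_ifs(s.else_body);
            if (s.then_body.empty() && s.else_body.empty()) {
                it = block.erase(it);
                progress = true;
                continue;
            }
        }
        ++it;
    }
    return progress;
}
```

Returning a progress flag lets a caller rerun other optimization passes until nothing changes any more.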
Eric Anholt [Mon, 19 Mar 2012 23:37:23 +0000 (16:37 -0700)]
glsl: Add a helper for generating temporary variables in ir_builder.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Mon, 19 Mar 2012 23:27:34 +0000 (16:27 -0700)]
glsl: Add a helper for ir_builder to make dereferences for assignments.
v2: Fix writemask setup for non-vec4 assignments.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Mon, 19 Mar 2012 23:01:52 +0000 (16:01 -0700)]
glsl: Make a little tracking class for emitting IR lists.
This lets us significantly shorten p->instructions->push_tail(ir), and
will be used in a few more places.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Mon, 19 Mar 2012 21:26:04 +0000 (14:26 -0700)]
glsl: Add common swizzles to ir_builder.
Now we can fold a bunch of our expression setup in ff_fragment_shader
into single-line, parseable commits.
v2: Make it actually work. I wasn't setting num_components in the
mask structure, and not setting up a mask structure is way easier.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Mon, 19 Mar 2012 21:04:23 +0000 (14:04 -0700)]
glsl: Let ir_builder expressions take un-dereferenced variables.
Having to explicitly dereference is irritating and bloats the code,
when the compiler can detect and do the right thing.
v2: Use a little shim class to produce the automatic dereference
generation at compile time as opposed to runtime, while also
allowing compile-time type checking.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Mon, 19 Mar 2012 20:27:06 +0000 (13:27 -0700)]
glsl: Create an ir_builder helper for hand-generating IR.
The C++ constructors with placement new, while functional, are
extremely verbose, leading to generation of simple GLSL IR expressions
like (a * b + c * d) expanding to many lines of code and using lots of
temporary variables. By creating a new ir_builder.h that puts simple
generators in our namespace and taking advantage of ralloc_parent(),
we can generate much more compact code, at a minor runtime cost.
v2: Replace ir_instruction usage with just ir_rvalue.
v3: Drop remaining missed as_rvalue() in v2.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
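To illustrate why small free-function generators shorten hand-written IR, here is a toy expression builder. It is not the ir_builder.h API: `Expr`, `var`, `add`, `mul` and `print` are hypothetical, and `std::shared_ptr` stands in for the ralloc-based memory management the commit mentions.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical expression node: op is '+' or '*', or 0 for a variable.
struct Expr {
    char op;
    std::string name;
    std::shared_ptr<Expr> lhs, rhs;
};
using ExprPtr = std::shared_ptr<Expr>;

// Small generator functions, so expressions nest like ordinary calls.
ExprPtr var(const std::string &n)
{
    return std::make_shared<Expr>(Expr{0, n, nullptr, nullptr});
}

ExprPtr add(ExprPtr a, ExprPtr b)
{
    return std::make_shared<Expr>(Expr{'+', "", std::move(a), std::move(b)});
}

ExprPtr mul(ExprPtr a, ExprPtr b)
{
    return std::make_shared<Expr>(Expr{'*', "", std::move(a), std::move(b)});
}

// Prints the tree fully parenthesized, for inspection.
std::string print(const ExprPtr &e)
{
    if (!e->op)
        return e->name;
    return "(" + print(e->lhs) + " " + e->op + " " + print(e->rhs) + ")";
}
```

With these helpers, (a * b + c * d) is one line, `add(mul(var("a"), var("b")), mul(var("c"), var("d")))`, instead of one constructor call and one temporary per node.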
Christoph Bumiller [Thu, 8 Mar 2012 20:41:41 +0000 (21:41 +0100)]
nv50,nvc0: fix handling of user vbufs with stride < access size
Christoph Bumiller [Tue, 28 Feb 2012 18:25:57 +0000 (19:25 +0100)]
nvc0: prefix all macro methods with MACRO
Some of them have non-macro counterparts.
Christoph Bumiller [Sat, 14 Apr 2012 04:08:08 +0000 (06:08 +0200)]
nvc0: replace VERTEX_DATA push mode with translate to buffer
While pushing vertices through the FIFO is relatively fast on nv50,
it's horribly slow on nvc0.
Christoph Bumiller [Fri, 16 Mar 2012 16:37:32 +0000 (17:37 +0100)]
nvc0: improve vertex state validation
Now updating vertex attribute format only when necessary.
Christoph Bumiller [Thu, 8 Mar 2012 14:56:11 +0000 (15:56 +0100)]
nvc0: track texture dirty state individually
Christoph Bumiller [Thu, 1 Mar 2012 20:28:29 +0000 (21:28 +0100)]
nv50,nvc0: use new scratch buffers code
Christoph Bumiller [Sat, 14 Apr 2012 03:38:16 +0000 (05:38 +0200)]
nouveau: add new shared scratch buffers
Christoph Bumiller [Thu, 1 Mar 2012 20:23:06 +0000 (21:23 +0100)]
nvc0: only force early fragment tests if requested by shader
Christoph Bumiller [Wed, 7 Mar 2012 18:44:10 +0000 (19:44 +0100)]
nv50,nvc0: hold references to the framebuffer surfaces
Marek Olšák [Fri, 13 Apr 2012 15:51:42 +0000 (17:51 +0200)]
r300g: align vertex buffer suballocations to 4
Marek Olšák [Fri, 13 Apr 2012 15:51:42 +0000 (17:51 +0200)]
u_blitter: align vertex buffer suballocations to 4
Brian Paul [Fri, 13 Apr 2012 20:31:16 +0000 (14:31 -0600)]
docs: document another viewperf bug in Maya-03
Marcin Slusarz [Fri, 13 Apr 2012 19:55:56 +0000 (21:55 +0200)]
xorg/nouveau: switch to libdrm_nouveau-2.0
Martin Peres [Fri, 13 Apr 2012 18:53:02 +0000 (20:53 +0200)]
targets/{egl-static,gbm}: further clean-up the nvfx remains
Christoph Bumiller [Sat, 14 Apr 2012 01:05:02 +0000 (03:05 +0200)]
nvc0: remove include of old libdrm_nouveau's nouveau_reloc.h
Christoph Bumiller [Sat, 14 Apr 2012 00:39:16 +0000 (02:39 +0200)]
nv50,nvc0: handle PIPE_CAP_MAX_DUAL_SOURCE_RENDER_TARGETS
Christoph Bumiller [Sat, 14 Apr 2012 00:38:25 +0000 (02:38 +0200)]
nv30: s/DUAL_SOURCE_BLEND/MAX_DUAL_SOURCE_RENDER_TARGETS
Merge accident.
Ben Skeggs [Wed, 11 Jan 2012 11:42:07 +0000 (12:42 +0100)]
nv30: import new driver for GeForce FX/6/7 chipsets, and Quadro variants
The primary motivation for this rewrite was to have a maintainable driver
going forward, as nvfx was quite horrible in a lot of ways.
The driver is heavily based on the design of the nv50/nvc0 3d drivers we
already have, and uses the same common buffer/fence code. It also passes
a HEAP more piglit tests than nvfx did, supports a couple more features,
with probably a few more still to come.
The CPU footprint of this driver is far far less than nvfx, and translates
into far greater framerates in a lot of applications (unless you're using
a CPU that's way way newer than the GPUs of these generations....)
Basically, we once again have a maintained driver for these chipsets \o/
Feel free to report bugs now!
Christoph Bumiller [Fri, 6 Apr 2012 13:41:55 +0000 (15:41 +0200)]
nouveau: switch to libdrm_nouveau-2.0
Christoph Bumiller [Sun, 12 Feb 2012 23:33:55 +0000 (00:33 +0100)]
nvc0: remove obsolete nvc0_push2.c
Slower version of nvc0_push.c, was only used to ascertain that
bugs were not the new version's fault.
Christoph Bumiller [Fri, 10 Feb 2012 12:18:13 +0000 (13:18 +0100)]
nouveau: remove automatic buffer migration heuristics
Ben Skeggs [Thu, 16 Feb 2012 12:08:41 +0000 (22:08 +1000)]
nvfx: completely remove this driver (GeForce FX/6/7)
This driver hasn't been maintained properly for a very long time, and for
many very good reasons. It's horrible.
A new driver supporting these chipsets will appear with the commits that
port vieux/nv50/nvc0 to libdrm_nouveau-2.0.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Fri, 13 Apr 2012 07:50:37 +0000 (17:50 +1000)]
nouveau: rework and simplify nv04/nv05 driver a bit
TEXTURED_TRIANGLE and MULTITEX_TRIANGLE are both a bit special in that if
you use any other graph object in the meantime they'll forget their state
and spew a lovely METHOD_CNT error at you when you try to draw.
The pre-newlib driver has a flush_notify() hook which does this state
re-emit, and a number of random workarounds like extra flushes and state
dirtying after various operations to solve this issue.
I'm taking a slightly different approach to things instead, which has the
nice side-effect of removing the divergent code-paths for ttri/mtri, the
flush/dirty workarounds and the need for flush_notify. Also gives a few
FPS boost in OA, yay.
Ben Skeggs [Fri, 23 Dec 2011 04:03:49 +0000 (14:03 +1000)]
nouveau/vieux: switch to libdrm_nouveau-2.0
Dave Airlie [Fri, 13 Apr 2012 16:15:47 +0000 (17:15 +0100)]
docs: update GL3.txt for ARB_blend_func_extended
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Fri, 13 Apr 2012 16:13:01 +0000 (17:13 +0100)]
gallium: document dual source blending restrictions on gallium
As per Brian's suggestion, document the restrictions on dual src blending.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Sat, 24 Mar 2012 13:37:16 +0000 (13:37 +0000)]
r600g: initial r600 dual src blending support
survives piglit with no regressions on rv610/evergreen
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Sat, 24 Mar 2012 13:36:59 +0000 (13:36 +0000)]
softpipe: add dual source blending support
This adds support for a single dual source blending MRT to softpipe.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Sat, 24 Mar 2012 14:28:03 +0000 (14:28 +0000)]
util: add dual blend helper function (v2)
This is just a function to tell if a certain blend mode requires dual sources.
v2: move to inlines as per Brian's suggestion
Signed-off-by: Dave Airlie <airlied@redhat.com>
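A sketch of what such a helper can look like, assuming a blend state with four factors per render target. The enum, the struct and both function names are hypothetical and do not reproduce the gallium definitions. The point is only that a blend mode needs dual sources as soon as any factor refers to the second source colour.

```cpp
// Hypothetical blend factors; the *Src1* ones read the second
// colour output of the fragment shader.
enum class BlendFactor {
    One,
    SrcColor,
    SrcAlpha,
    DstAlpha,
    Src1Color,
    Src1Alpha,
    InvSrc1Color,
    InvSrc1Alpha
};

// Hypothetical per-render-target blend state.
struct RtBlend {
    BlendFactor rgb_src, rgb_dst;
    BlendFactor alpha_src, alpha_dst;
};

inline bool factor_is_dual(BlendFactor f)
{
    return f == BlendFactor::Src1Color || f == BlendFactor::Src1Alpha ||
           f == BlendFactor::InvSrc1Color || f == BlendFactor::InvSrc1Alpha;
}

// True if any of the four factors needs the second source colour.
inline bool blend_is_dual(const RtBlend &rt)
{
    return factor_is_dual(rt.rgb_src) || factor_is_dual(rt.rgb_dst) ||
           factor_is_dual(rt.alpha_src) || factor_is_dual(rt.alpha_dst);
}
```

Keeping both functions inline in a header lets every driver share the check without adding a new object file.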
Dave Airlie [Sat, 24 Mar 2012 13:36:17 +0000 (13:36 +0000)]
st/mesa: add ARB_blend_func_extended support to state tracker.
This adds the blend mode mapping, it also uses the var->index in the
glsl to tgsi converter - this is the other half of my using 4 in the GLSL
compiler.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Sat, 24 Mar 2012 13:34:45 +0000 (13:34 +0000)]
gallium: rename DUAL_SOURCE_BLEND cap to MAX_DUAL_SOURCE_RENDER_TARGETS
Though I don't think we'll ever expose > 1.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Sat, 24 Mar 2012 13:33:41 +0000 (13:33 +0000)]
glsl: add support for ARB_blend_func_extended (v3)
This adds index support to the GLSL compiler.
I'm not 100% sure of my approach here, esp. without knowing how output
ordering happens wrt location, index pairs, in the "mark" function.
Since current hw doesn't ever have a location > 0 with an index > 0,
we don't have to work out if the output ordering the hw requires is
location, index, location, index or location, location, index, index.
But we have no hw to know, so punt on it for now.
v2: index requires layout - catch and error
setup explicit index properly.
v3: drop idx_offset stuff, assume index follows location
Signed-off-by: Dave Airlie <airlied@redhat.com>