mesa.git

Christoph Bumiller [Mon, 9 Apr 2012 18:58:39 +0000 (20:58 +0200)]
nv50/ir: rewrite the register allocator as GCRA, with spilling

This is more flexible than the linear scan, and we don't need the
separate allocation pass for constrained values anymore.

Christoph Bumiller [Thu, 5 Apr 2012 21:14:33 +0000 (23:14 +0200)]
nv50/ir/tgsi: only export x-component of PSIZE

Christoph Bumiller [Thu, 5 Apr 2012 20:53:46 +0000 (22:53 +0200)]
nvc0: fix emission of 3rd src in SET_AND,OR,XOR

Francisco Jerez [Mon, 9 Apr 2012 18:48:43 +0000 (20:48 +0200)]
nv50/ir: Fix BuildUtil::mkSelect and mkClobber

Christoph Bumiller [Fri, 6 Apr 2012 17:18:05 +0000 (19:18 +0200)]
nv50/ir: fix reg file conflicts with undefined-value placeholders

Christoph Bumiller [Mon, 2 Apr 2012 18:55:03 +0000 (20:55 +0200)]
nv50/ir/opt: silence warning (int < Elements() signedness)

Christoph Bumiller [Mon, 2 Apr 2012 18:53:46 +0000 (20:53 +0200)]
nv50/ir/opt: fix combineSt access to wrong instruction

Christoph Bumiller [Sun, 29 Jan 2012 14:41:52 +0000 (15:41 +0100)]
nv50/ir/opt: another insn NULL check in phi elimination

Francisco Jerez [Sun, 27 Nov 2011 12:06:10 +0000 (13:06 +0100)]
nv50/ir/ssa: Take into account function inputs and outputs.

Francisco Jerez [Tue, 27 Mar 2012 19:48:58 +0000 (21:48 +0200)]
nv50/ir: Clean up before calculating instruction ordering for a new function.

Francisco Jerez [Tue, 15 Nov 2011 16:24:18 +0000 (17:24 +0100)]
nv50/ir/ra: Allocate registers for function arguments.

Francisco Jerez [Fri, 6 Apr 2012 17:16:04 +0000 (19:16 +0200)]
nv50/ir: Take into account function args in the live range calculation code.

Francisco Jerez [Thu, 29 Mar 2012 21:23:53 +0000 (23:23 +0200)]
nv50/ir/ra: Use matching physical regs for function args in caller and callee.

Francisco Jerez [Fri, 6 Apr 2012 17:08:27 +0000 (19:08 +0200)]
nv50/ir/tgsi: Infer function inputs/outputs.

Edit: Don't do it for the main function of (graphics) shaders,
its inputs and outputs always go through TGSI_FILE_INPUT/OUTPUT.
This prevents all TEMPs from counting as live out and reduces
register pressure.

Francisco Jerez [Tue, 27 Mar 2012 15:29:55 +0000 (17:29 +0200)]
nv50/ir/tgsi: Replace the inlining logic with proper function calls.

Francisco Jerez [Tue, 27 Mar 2012 15:30:31 +0000 (17:30 +0200)]
nv50/ir: Decouple DataArray from the dictionary that maps locations to values.

The point is to keep an independent dictionary for each function.

The array that was being used as a dictionary has been converted into a
"bimap" for two different reasons: first, because keeping an almost
empty array with as many entries as there are registers in the
program, once for every function, would be wasteful, and
second, because we want to be able to map Value pointers back to
locations at some point.
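
The two-way mapping described above can be sketched roughly as follows. This is only an illustration under stated assumptions: Value, Location and LocationBimap are hypothetical stand-ins, not the actual nv50/ir classes, and the real implementation may be organized quite differently.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Hypothetical stand-ins; the real nv50/ir types are not reproduced here.
struct Value { int id; };

struct Location {
    unsigned file;   // register file (assumed field)
    unsigned index;  // register number within that file (assumed field)
    bool operator<(const Location &o) const {
        return file != o.file ? file < o.file : index < o.index;
    }
};

// Sparse two-way map: only locations that are actually used take up space,
// and a Value pointer can be mapped back to its location.
class LocationBimap {
public:
    void insert(const Location &loc, Value *val) {
        assert(val != nullptr);
        // Drop stale entries first so both directions stay consistent.
        auto f = fwd.find(loc);
        if (f != fwd.end())
            rev.erase(f->second);
        auto r = rev.find(val);
        if (r != rev.end())
            fwd.erase(r->second);
        fwd[loc] = val;
        rev[val] = loc;
    }

    // Location -> Value, or nullptr if the location is unmapped.
    Value *lookup(const Location &loc) const {
        auto it = fwd.find(loc);
        return it == fwd.end() ? nullptr : it->second;
    }

    // Value -> Location, or nullptr if the value has no location.
    const Location *lookup(Value *val) const {
        auto it = rev.find(val);
        return it == rev.end() ? nullptr : &it->second;
    }

    std::size_t size() const { return fwd.size(); }

private:
    std::map<Location, Value *> fwd;
    std::map<Value *, Location> rev;
};
```

One such object per function keeps the per-function cost proportional to the number of mapped locations rather than to the total register count.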

Christoph Bumiller [Thu, 22 Mar 2012 10:59:32 +0000 (11:59 +0100)]
nv50/ir/opt: don't delete instruction in removeFlow before its last use

Christoph Bumiller [Thu, 22 Mar 2012 10:58:31 +0000 (11:58 +0100)]
nv50/ir/opt: check BB equality before instruction ordering in CSE

Christoph Bumiller [Thu, 22 Mar 2012 10:51:52 +0000 (11:51 +0100)]
nv50/ir/opt: don't copy-propagate cond MOVs or MOVs to other reg files

We've never encountered the latter on nvc0, but on nv50 we have moves
between GPRs and address regs.

Christoph Bumiller [Tue, 7 Feb 2012 19:45:03 +0000 (20:45 +0100)]
nv50/ir/opt: don't replace conditional definitions in CSE

Francisco Jerez [Thu, 17 Nov 2011 17:23:28 +0000 (18:23 +0100)]
nv50/ir/opt: Update the symbol size when combining loads and stores.

Christoph Bumiller [Wed, 21 Dec 2011 16:06:27 +0000 (17:06 +0100)]
nv50/ir: initialize FlowInstruction::builtin

Francisco Jerez [Wed, 21 Mar 2012 22:53:01 +0000 (23:53 +0100)]
nv50/ir/opt: Fix for function calls.

Francisco Jerez [Fri, 6 Apr 2012 16:50:56 +0000 (18:50 +0200)]
nv50/ir: Build a "symbol" table with the binary offsets of each function.

Francisco Jerez [Mon, 14 Nov 2011 23:18:28 +0000 (00:18 +0100)]
nv50/ir: Add support for removing functions from a program.

Francisco Jerez [Mon, 9 Apr 2012 19:18:31 +0000 (21:18 +0200)]
nv50/ir: Scan program functions in DFS-postorder.

The reason is that several passes (regalloc, function argument
binding, inlining) are going to require the callees of a function to
be processed before the caller.
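
A minimal sketch of the ordering described above, assuming a simple call graph. Func and postorder() are hypothetical names used for illustration only; they are not the nv50/ir API.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical call-graph node; not the nv50/ir Function class.
struct Func {
    std::string name;
    std::vector<Func *> callees;
};

static void visit(Func *f, std::unordered_set<Func *> &seen,
                  std::vector<Func *> &order)
{
    if (!seen.insert(f).second)
        return;                    // already visited; also stops on call cycles
    for (Func *callee : f->callees)
        visit(callee, seen, order);
    order.push_back(f);            // appended only after all of its callees
}

// Returns the functions reachable from root, callees before callers.
std::vector<Func *> postorder(Func *root)
{
    std::unordered_set<Func *> seen;
    std::vector<Func *> order;
    visit(root, seen, order);
    assert(!order.empty() && order.back() == root);
    return order;
}
```

A pass that iterates over the returned vector from front to back sees every callee before any function that calls it, which is the property the commit message asks for.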

Francisco Jerez [Fri, 6 Apr 2012 16:43:29 +0000 (18:43 +0200)]
nv50/ir: Deal with graph iterators using RAII.

Francisco Jerez [Tue, 15 Nov 2011 01:07:21 +0000 (02:07 +0100)]
nv50/ir: Add convenience method for calculating the live sets of a function.

Francisco Jerez [Wed, 21 Mar 2012 20:43:26 +0000 (21:43 +0100)]
nv50/ir: Add support code for calculating the clobber set of a BB or function.

Francisco Jerez [Mon, 9 Apr 2012 18:43:28 +0000 (20:43 +0200)]
nv50/ir/opt: Don't lose modifiers during constant folding.

Francisco Jerez [Tue, 20 Mar 2012 23:39:00 +0000 (00:39 +0100)]
nv50/ir/opt: Improve modifier handling.

Francisco Jerez [Sat, 14 Apr 2012 19:25:22 +0000 (21:25 +0200)]
nv50/ir: Add support for cloning FlowInsns, ImmediateVals and BBs.

Francisco Jerez [Sat, 14 Apr 2012 19:24:16 +0000 (21:24 +0200)]
nv50/ir: Decouple object cloning logic from the sub-object recursion policy.

Francisco Jerez [Sat, 14 Apr 2012 19:23:03 +0000 (21:23 +0200)]
nv50/ir: Make sure that several IR objects are destroyed on takedown.

Christoph Bumiller [Mon, 9 Apr 2012 18:40:35 +0000 (20:40 +0200)]
nv50/ir: make Instruction::src/def container private

Francisco Jerez [Thu, 29 Mar 2012 19:18:24 +0000 (21:18 +0200)]
nv50/ir: Add support for unlimited instruction arguments.

Christoph Bumiller [Thu, 29 Mar 2012 19:32:41 +0000 (21:32 +0200)]
nv50/ir: temporarily exclude nv50 code emitter from build

It's not used yet and shouldn't have been included in the first
place.

Christoph Bumiller [Fri, 6 Apr 2012 16:37:24 +0000 (18:37 +0200)]
nv50/ir: copy value size in SSA-rename pass

Christoph Bumiller [Mon, 9 Apr 2012 18:34:24 +0000 (20:34 +0200)]
nv50/ir/opt: improve post-multiply and check target for support

Christoph Bumiller [Wed, 28 Mar 2012 21:50:32 +0000 (23:50 +0200)]
nv50/ir: add setFlagsDef/Src helper

Will be used by nv50 target.

Christoph Bumiller [Fri, 6 Apr 2012 16:34:44 +0000 (18:34 +0200)]
nv50/ir: add isAccessSupported check for memory access coalescing

Christoph Bumiller [Wed, 28 Mar 2012 19:30:59 +0000 (21:30 +0200)]
nv50/ir: add function for splitting a BasicBlock

Fixes to initial implementation by Francisco Jerez.

Francisco Jerez [Tue, 15 Nov 2011 20:39:52 +0000 (21:39 +0100)]
nv50/ir: Allow attaching two nodes when either one is already inside the graph.

Francisco Jerez [Tue, 15 Nov 2011 20:39:22 +0000 (21:39 +0100)]
nv50/ir: Allow inserting isolated nodes to a graph.

Francisco Jerez [Mon, 14 Nov 2011 23:38:15 +0000 (00:38 +0100)]
nv50/ir: Fix memory corruption in Function::orderInstructions().

"iter" doesn't reference a BasicBlock directly, but a Node::Graph,
i.e. BasicBlock::get() is casting to the wrong pointer type.

Francisco Jerez [Tue, 15 Nov 2011 14:58:04 +0000 (15:58 +0100)]
nv50/ir: Fix up insertion of PHI instructions using bb->insertHead().

Christoph Bumiller [Tue, 15 Nov 2011 23:39:41 +0000 (00:39 +0100)]
nv50/ir: fix insertHead and remove for BBs with PHI ops only

Francisco Jerez [Sat, 19 Nov 2011 20:31:28 +0000 (21:31 +0100)]
nv50/ir: Don't crash on zero sized BitSets.

Francisco Jerez [Tue, 15 Nov 2011 00:50:58 +0000 (01:50 +0100)]
nv50/ir: Fix Interval::clear().

Christoph Bumiller [Sun, 25 Dec 2011 17:34:35 +0000 (18:34 +0100)]
nv50/ir/tgsi: handle inferSrcType(NOT) to be u32

Francisco Jerez [Mon, 14 Nov 2011 22:09:45 +0000 (23:09 +0100)]
nv50/ir/opt: Fix OP_NOT to modifier conversion.

Dave Airlie [Sat, 14 Apr 2012 19:25:59 +0000 (20:25 +0100)]
r600g: disable dual-src hangs evergreen for some reason.

This did work previously, so I've broken something.

Signed-off-by: Dave Airlie <airlied@redhat.com>

Tom Stellard [Sat, 14 Apr 2012 16:11:29 +0000 (12:11 -0400)]
r300/compiler: Exit immediately from rc_vert_fc() if there is an error

This way we correctly report "Too many temporaries" errors.

https://bugs.freedesktop.org/show_bug.cgi?id=48680

Note: This is a candidate for the stable branches.

Tom Stellard [Sat, 14 Apr 2012 14:02:19 +0000 (10:02 -0400)]
r300/compiler: Copy all instruction attributes during local transforms

Instruction attributes like WriteALUResult and ALUResultCompare
were being discarded during some of the local transformations.

This fixes the following piglit tests:

glsl1-inequality (vec2, pass)
loopfunc
fs-any-bvec2-using-if
fs-op-ne-bvec2-bvec2-using-if
fs-op-ne-ivec2-ivec2-using-if
fs-op-ne-mat2-mat2-using-if
fs-op-ne-vec2-vec2-using-if
fs-op-ne-mat2x3-mat2x3-using-if
fs-op-ne-mat2x4-mat2x4-using-if

https://bugs.freedesktop.org/show_bug.cgi?id=45921

NOTE: This is a candidate for the stable branches.

Tom Stellard [Wed, 21 Sep 2011 04:05:55 +0000 (21:05 -0700)]
r300/compiler: Fix nested flow control in r500 vertex shaders

Tom Stellard [Fri, 13 Apr 2012 02:07:40 +0000 (22:07 -0400)]
r300/compiler: Clear loop registers in vertex shaders w/o loops

The loop registers weren't being cleared, so any shader that was
executed after a shader containing loops was at risk of having a loop
randomly inserted into it.

This fixes over one hundred piglit tests, although these tests
only failed during full piglit runs and would pass if
run individually.  The exact number of piglit tests that this patch
fixes will vary depending on the version of piglit and the order the
tests are run.

NOTE: This is a candidate for the stable branches.

Eric Anholt [Fri, 16 Mar 2012 22:44:25 +0000 (15:44 -0700)]
glsl: If an "if" has no "then" or "else" code left, remove it.

Cuts 8/1068 instructions from glyphy's fragment shaders on i965.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

Eric Anholt [Mon, 19 Mar 2012 23:37:23 +0000 (16:37 -0700)]
glsl: Add a helper for generating temporary variables in ir_builder.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

Eric Anholt [Mon, 19 Mar 2012 23:27:34 +0000 (16:27 -0700)]
glsl: Add a helper for ir_builder to make dereferences for assignments.

v2: Fix writemask setup for non-vec4 assignments.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

Eric Anholt [Mon, 19 Mar 2012 23:01:52 +0000 (16:01 -0700)]
glsl: Make a little tracking class for emitting IR lists.

This lets us significantly shorten p->instructions->push_tail(ir), and
will be used in a few more places.
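
As a rough illustration of such a tracking class, here is a sketch under assumptions: Insn and Emitter are hypothetical stand-ins, and the class added by this commit may look different. The idea is that the object remembers the target list once, so each call site shrinks to a short emit().

```cpp
#include <cassert>
#include <list>
#include <string>

// Hypothetical instruction type standing in for a GLSL IR node.
struct Insn {
    std::string text;
};

// Remembers which list is being built so callers do not have to spell out
// the full "p->instructions->push_tail(ir)" style chain at every call site.
class Emitter {
public:
    explicit Emitter(std::list<Insn> *target) : instructions(target) {
        assert(target != nullptr);
    }

    // Append one instruction to the tracked list.
    void emit(const Insn &ir) { instructions->push_back(ir); }

private:
    std::list<Insn> *instructions;  // not owned
};
```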

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

Eric Anholt [Mon, 19 Mar 2012 21:26:04 +0000 (14:26 -0700)]
glsl: Add common swizzles to ir_builder.

Now we can fold a bunch of our expression setup in ff_fragment_shader
into single-line, parseable commits.

v2: Make it actually work.  I wasn't setting num_components in the
    mask structure, and not setting up a mask structure is way easier.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

Eric Anholt [Mon, 19 Mar 2012 21:04:23 +0000 (14:04 -0700)]
glsl: Let ir_builder expressions take un-dereferenced variables.

Having to explicitly dereference is irritating and bloats the code,
when the compiler can detect and do the right thing.

v2: Use a little shim class to produce the automatic dereference
    generation at compile time as opposed to runtime, while also
    allowing compile-time type checking.
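
The shim idea from v2 can be sketched like this. All names here (Rvalue, Variable, Deref, Expr, Operand, make, add) are hypothetical toy types, not the GLSL IR classes, and the global arena merely stands in for whatever allocator the real code uses. The point is that overloaded converting constructors pick the automatic dereference at compile time, and an argument of an unrelated type fails to compile.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

struct Rvalue { virtual ~Rvalue() {} };
struct Variable { const char *name; };

struct Deref : Rvalue {
    explicit Deref(Variable *v) : var(v) {}
    Variable *var;
};

struct Expr : Rvalue {
    Expr(char o, Rvalue *l, Rvalue *r) : op(o), a(l), b(r) {}
    char op;
    Rvalue *a, *b;
};

// Toy arena that owns every node created through make().
static std::vector<std::unique_ptr<Rvalue>> arena;

template <typename T, typename... Args>
T *make(Args &&...args)
{
    std::unique_ptr<T> owned(new T(std::forward<Args>(args)...));
    T *raw = owned.get();
    arena.push_back(std::move(owned));
    return raw;
}

// The shim: accepts either an rvalue or a bare variable. The variable
// overload wraps it in a Deref; overload resolution happens at compile time.
class Operand {
public:
    Operand(Rvalue *v) : val(v) { assert(v != nullptr); }
    Operand(Variable *var) : val(make<Deref>(var)) {}
    Rvalue *val;
};

Expr *add(Operand a, Operand b)
{
    return make<Expr>('+', a.val, b.val);
}
```

With this in place, add(&x, &y) works for two variables, and add(expr, &y) works for an existing expression plus a variable, without any explicit dereference at the call site.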

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

Eric Anholt [Mon, 19 Mar 2012 20:27:06 +0000 (13:27 -0700)]
glsl: Create an ir_builder helper for hand-generating IR.

The C++ constructors with placement new, while functional, are
extremely verbose, leading to generation of simple GLSL IR expressions
like (a * b + c * d) expanding to many lines of code and using lots of
temporary variables.  By creating a new ir_builder.h that puts simple
generators in our namespace and taking advantage of ralloc_parent(),
we can generate much more compact code, at a minor runtime cost.

v2: Replace ir_instruction usage with just ir_rvalue.
v3: Drop remaining missed as_rvalue() in v2.
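
To illustrate why small generator functions make expressions like (a * b + c * d) compact, here is a self-contained sketch on a toy expression tree. Node, var(), mul(), add() and print() are assumptions made for this example; they do not reproduce the ir_builder.h interface, and ownership is handled with std::unique_ptr rather than ralloc.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Toy expression node: op is "var", "*" or "+".
struct Node {
    std::string op;
    std::string name;                 // only meaningful when op == "var"
    std::unique_ptr<Node> lhs, rhs;
};
using NodePtr = std::unique_ptr<Node>;

NodePtr var(const std::string &name)
{
    NodePtr n(new Node);
    n->op = "var";
    n->name = name;
    return n;
}

static NodePtr binop(const std::string &op, NodePtr a, NodePtr b)
{
    assert(a && b);
    NodePtr n(new Node);
    n->op = op;
    n->lhs = std::move(a);
    n->rhs = std::move(b);
    return n;
}

// The "simple generators": one short call per operation, nestable inline.
NodePtr mul(NodePtr a, NodePtr b) { return binop("*", std::move(a), std::move(b)); }
NodePtr add(NodePtr a, NodePtr b) { return binop("+", std::move(a), std::move(b)); }

// Fully parenthesized rendering, used only to inspect the result.
std::string print(const Node &n)
{
    if (n.op == "var")
        return n.name;
    return "(" + print(*n.lhs) + " " + n.op + " " + print(*n.rhs) + ")";
}
```

With these helpers the whole expression is a single line, add(mul(var("a"), var("b")), mul(var("c"), var("d"))), instead of several constructor calls with named temporaries.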

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

Christoph Bumiller [Thu, 8 Mar 2012 20:41:41 +0000 (21:41 +0100)]
nv50,nvc0: fix handling of user vbufs with stride < access size

Christoph Bumiller [Tue, 28 Feb 2012 18:25:57 +0000 (19:25 +0100)]
nvc0: prefix all macro methods with MACRO

Some of them have non-macro counterparts.

Christoph Bumiller [Sat, 14 Apr 2012 04:08:08 +0000 (06:08 +0200)]
nvc0: replace VERTEX_DATA push mode with translate to buffer

While pushing vertices through the FIFO is relatively fast on nv50,
it's horribly slow on nvc0.

Christoph Bumiller [Fri, 16 Mar 2012 16:37:32 +0000 (17:37 +0100)]
nvc0: improve vertex state validation

Now updating vertex attribute format only when necessary.

Christoph Bumiller [Thu, 8 Mar 2012 14:56:11 +0000 (15:56 +0100)]
nvc0: track texture dirty state individually

Christoph Bumiller [Thu, 1 Mar 2012 20:28:29 +0000 (21:28 +0100)]
nv50,nvc0: use new scratch buffers code

Christoph Bumiller [Sat, 14 Apr 2012 03:38:16 +0000 (05:38 +0200)]
nouveau: add new shared scratch buffers

Christoph Bumiller [Thu, 1 Mar 2012 20:23:06 +0000 (21:23 +0100)]
nvc0: only force early fragment tests if requested by shader

Christoph Bumiller [Wed, 7 Mar 2012 18:44:10 +0000 (19:44 +0100)]
nv50,nvc0: hold references to the framebuffer surfaces

Marek Olšák [Fri, 13 Apr 2012 15:51:42 +0000 (17:51 +0200)]
r300g: align vertex buffer suballocations to 4

Marek Olšák [Fri, 13 Apr 2012 15:51:42 +0000 (17:51 +0200)]
u_blitter: align vertex buffer suballocations to 4

Brian Paul [Fri, 13 Apr 2012 20:31:16 +0000 (14:31 -0600)]
docs: document another viewperf bug in Maya-03

Marcin Slusarz [Fri, 13 Apr 2012 19:55:56 +0000 (21:55 +0200)]
xorg/nouveau: switch to libdrm_nouveau-2.0

Martin Peres [Fri, 13 Apr 2012 18:53:02 +0000 (20:53 +0200)]
targets/{egl-static,gbm}: further clean-up the nvfx remains

Christoph Bumiller [Sat, 14 Apr 2012 01:05:02 +0000 (03:05 +0200)]
nvc0: remove include of old libdrm_nouveau's nouveau_reloc.h

Christoph Bumiller [Sat, 14 Apr 2012 00:39:16 +0000 (02:39 +0200)]
nv50,nvc0: handle PIPE_CAP_MAX_DUAL_SOURCE_RENDER_TARGETS

Christoph Bumiller [Sat, 14 Apr 2012 00:38:25 +0000 (02:38 +0200)]
nv30: s/DUAL_SOURCE_BLEND/MAX_DUAL_SOURCE_RENDER_TARGETS

Merge accident.

Ben Skeggs [Wed, 11 Jan 2012 11:42:07 +0000 (12:42 +0100)]
nv30: import new driver for GeForce FX/6/7 chipsets, and Quadro variants

The primary motivation for this rewrite was to have a maintainable driver
going forward, as nvfx was quite horrible in a lot of ways.

The driver is heavily based on the design of the nv50/nvc0 3d drivers we
already have, and uses the same common buffer/fence code.  It also passes
a HEAP more piglit tests than nvfx did, supports a couple more features,
and a few more to come still probably.

The CPU footprint of this driver is far far less than nvfx, and translates
into far greater framerates in a lot of applications (unless you're using
a CPU that's way way newer than the GPUs of these generations....)

Basically, we once again have a maintained driver for these chipsets \o/

Feel free to report bugs now!

Christoph Bumiller [Fri, 6 Apr 2012 13:41:55 +0000 (15:41 +0200)]
nouveau: switch to libdrm_nouveau-2.0

Christoph Bumiller [Sun, 12 Feb 2012 23:33:55 +0000 (00:33 +0100)]
nvc0: remove obsolete nvc0_push2.c

Slower version of nvc0_push.c, was only used to ascertain that
bugs were not the new version's fault.

Christoph Bumiller [Fri, 10 Feb 2012 12:18:13 +0000 (13:18 +0100)]
nouveau: remove automatic buffer migration heuristics

Ben Skeggs [Thu, 16 Feb 2012 12:08:41 +0000 (22:08 +1000)]
nvfx: completely remove this driver (GeForce FX/6/7)

This driver hasn't been maintained properly for a very long time, and for
many very good reasons.  It's horrible.

A new driver supporting these chipsets will appear with the commits that
port vieux/nv50/nvc0 to libdrm_nouveau-2.0.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

Ben Skeggs [Fri, 13 Apr 2012 07:50:37 +0000 (17:50 +1000)]
nouveau: rework and simplify nv04/nv05 driver a bit

TEXTURED_TRIANGLE and MULTITEX_TRIANGLE are both a bit special in that if
you use any other graph object in the meantime they'll forget their state
and spew a lovely METHOD_CNT error at you when you try to draw.

The pre-newlib driver has a flush_notify() hook which does this state
re-emit, and a number of random workarounds like extra flushes and state
dirtying after various operations to solve this issue.

I'm taking a slightly different approach to things instead, which has the
nice side-effect of removing the divergent code-paths for ttri/mtri, the
flush/dirty workarounds and the need for flush_notify.  Also gives a few
FPS boost in OA, yay.

Ben Skeggs [Fri, 23 Dec 2011 04:03:49 +0000 (14:03 +1000)]
nouveau/vieux: switch to libdrm_nouveau-2.0

Dave Airlie [Fri, 13 Apr 2012 16:15:47 +0000 (17:15 +0100)]
docs: update GL3.txt for ARB_blend_func_extended

Signed-off-by: Dave Airlie <airlied@redhat.com>

Dave Airlie [Fri, 13 Apr 2012 16:13:01 +0000 (17:13 +0100)]
gallium: document dual source blending restrictions on gallium

As per Brian's suggestion, document the restrictions on dual src blending.

Signed-off-by: Dave Airlie <airlied@redhat.com>

Dave Airlie [Sat, 24 Mar 2012 13:37:16 +0000 (13:37 +0000)]
r600g: initial r600 dual src blending support

survives piglit with no regressions on rv610/evergreen

Signed-off-by: Dave Airlie <airlied@redhat.com>

Dave Airlie [Sat, 24 Mar 2012 13:36:59 +0000 (13:36 +0000)]
softpipe: add dual source blending support

This adds support for a single dual source blending MRT to softpipe.

Signed-off-by: Dave Airlie <airlied@redhat.com>

Dave Airlie [Sat, 24 Mar 2012 14:28:03 +0000 (14:28 +0000)]
util: add dual blend helper function (v2)

This is just a function to tell if a certain blend mode requires dual sources.

v2: move to inlines as per Brian's suggestion
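
A check of this kind might look roughly like the sketch below. The enum values and struct fields are hypothetical and only loosely modeled on blend factor naming; the actual helper added to util and its signature are not reproduced here.

```cpp
#include <cassert>

// Hypothetical subset of blend factors; the SRC1 variants are the ones that
// read the second color output of the fragment shader.
enum BlendFactor {
    FACTOR_ONE,
    FACTOR_SRC_COLOR,
    FACTOR_SRC_ALPHA,
    FACTOR_DST_COLOR,
    FACTOR_SRC1_COLOR,
    FACTOR_SRC1_ALPHA,
    FACTOR_INV_SRC1_COLOR,
    FACTOR_INV_SRC1_ALPHA
};

// Hypothetical per-render-target blend state.
struct BlendTarget {
    BlendFactor rgb_src, rgb_dst, alpha_src, alpha_dst;
};

inline bool factor_is_dual_src(BlendFactor f)
{
    switch (f) {
    case FACTOR_SRC1_COLOR:
    case FACTOR_SRC1_ALPHA:
    case FACTOR_INV_SRC1_COLOR:
    case FACTOR_INV_SRC1_ALPHA:
        return true;
    default:
        return false;
    }
}

// A blend mode needs dual sources if any of its four factors uses SRC1.
inline bool blend_is_dual_src(const BlendTarget &rt)
{
    return factor_is_dual_src(rt.rgb_src) || factor_is_dual_src(rt.rgb_dst) ||
           factor_is_dual_src(rt.alpha_src) || factor_is_dual_src(rt.alpha_dst);
}
```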

Signed-off-by: Dave Airlie <airlied@redhat.com>

Dave Airlie [Sat, 24 Mar 2012 13:36:17 +0000 (13:36 +0000)]
st/mesa: add ARB_blend_func_extended support to state tracker.

This adds the blend mode mapping. It also uses var->index in the
GLSL-to-TGSI converter; this is the other half of my using 4 in the GLSL
compiler.

Signed-off-by: Dave Airlie <airlied@redhat.com>

Dave Airlie [Sat, 24 Mar 2012 13:34:45 +0000 (13:34 +0000)]
gallium: rename DUAL_SOURCE_BLEND cap to MAX_DUAL_SOURCE_RENDER_TARGETS

Though I don't think we'll ever expose > 1.

Signed-off-by: Dave Airlie <airlied@redhat.com>

Dave Airlie [Sat, 24 Mar 2012 13:33:41 +0000 (13:33 +0000)]
glsl: add support for ARB_blend_func_extended (v3)

This adds index support to the GLSL compiler.

I'm not 100% sure of my approach here, especially without knowing how output
ordering happens wrt (location, index) pairs in the "mark" function.

Since current hw doesn't ever have a location > 0 with an index > 0,
we don't have to work out if the output ordering the hw requires is
location, index, location, index or location, location, index, index.
But we have no hw to know, so punt on it for now.

v2: index requires layout - catch and error
    setup explicit index properly.

v3: drop idx_offset stuff, assume index follows location

Signed-off-by: Dave Airlie <airlied@redhat.com>

Dave Airlie [Sat, 24 Mar 2012 13:33:00 +0000 (13:33 +0000)]
mesa: add support for ARB_blend_func_extended (v4)

Add implementations of the two API functions,
Add a new string-to-uint mapping for index bindings
Add the blending mode validation for SRC1 + SRC_ALPHA_SATURATE
Add get for MAX_DUAL_SOURCE_DRAW_BUFFERS

v2:
Add check in valid_to_render to address case in spec ERRORS.

v3:
Add index to ir.h so this patch compiles on its own
fixup comment

v4: fixup Brian's comments

The GLSL patch will setup the indices.

Signed-off-by: Dave Airlie <airlied@redhat.com>

Tom Stellard [Fri, 6 Jan 2012 22:38:37 +0000 (17:38 -0500)]
radeonsi: initial WIP SI code

This commit adds initial support for acceleration
on SI chips.  egltri is starting to work.

The SI/R600 llvm backend is currently included in mesa
but that may change in the future.

The plan is to write a single gallium driver and
use gallium to support X acceleration.

This commit contains patches from:
Tom Stellard <thomas.stellard@amd.com>
Michel Dänzer <michel.daenzer@amd.com>
Alex Deucher <alexander.deucher@amd.com>
Vadim Girlin <vadimgirlin@gmail.com>

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

The following commits were squashed in:

======================================================================

radeonsi: Remove unused winsys pointer

This was removed from r600g in commit:

commit 96d882939d612fcc8332f107befec470ed4359de
Author: Marek Olšák <maraeo@gmail.com>
Date:   Fri Feb 17 01:49:49 2012 +0100

    gallium: remove unused winsys pointers in pipe_screen and pipe_context

    A winsys is already a private object of a driver.

======================================================================

radeonsi: Copy color clamping CAPs from r600

Not sure if the values of these CAPs are correct for radeonsi, but the
same changes were made to r600g in commit:

commit bc1c8369384b5e16547c5bf9728aa78f8dfd66cc
Author: Marek Olšák <maraeo@gmail.com>
Date:   Mon Jan 23 03:11:17 2012 +0100

    st/mesa: do vertex and fragment color clamping in shaders

    For ARB_color_buffer_float. Most hardware can't do it and st/mesa is
    the perfect place for a fallback.
    The exceptions are:
    - r500 (vertex clamp only)
    - nv50 (both)
    - nvc0 (both)
    - softpipe (both)

    We also have to take into account that r300 can do CLAMPED vertex colors only,
    while r600 can do UNCLAMPED vertex colors only. The difference can be expressed
    with the two new CAPs.

======================================================================

radeonsi: Remove PIPE_CAP_OUTPUT_READ

This CAP was dropped in commit:

commit 04e324008759282728a95a1394bac2c4c2a1a3f9
Author: Marek Olšák <maraeo@gmail.com>
Date:   Thu Feb 23 23:44:36 2012 +0100

    gallium: remove PIPE_SHADER_CAP_OUTPUT_READ

    r600g is the only driver which has made use of it. The reason the CAP was
    added was to fix some piglit tests when the GLSL pass lower_output_reads
    didn't exist.

    However, not removing output reads breaks the fallback for glClampColorARB,
    which assumes outputs are not readable. The fix would be non-trivial
    and my personal preference is to remove the CAP, considering that reading
    outputs is uncommon and that we can now use lower_output_reads to fix
    the issue that the CAP was supposed to work around in the first place.

======================================================================

radeonsi: Add missing parameters to rws->buffer_get_tiling() call

This was changed in commit:

commit c0c979eebc076b95cc8d18a013ce2968fe6311ad
Author: Jerome Glisse <jglisse@redhat.com>
Date:   Mon Jan 30 17:22:13 2012 -0500

    r600g: add support for common surface allocator for tiling v13

    Tiled surfaces have all kinds of alignment constraints that need to
    be met. Instead of duplicating all this code between the ddx and
    mesa, use common code in libdrm_radeon. This also ensures that both
    the ddx and mesa compute those alignments in the same way.

    v2 fix evergreen
    v3 fix compressed texture and workaround cube texture issue by
       disabling 2D array mode for cubemap (need to check if r7xx and
       newer are also affected by the issue)
    v4 fix texture array
    v5 fix evergreen and newer, split surface values computation from
       mipmap tree generation so that we can get them directly from the
       ddx
    v6 final fix to evergreen tile split value
    v7 fix mipmap offset to avoid to use random value, use color view
       depth view to address different layer as hardware is doing some
       magic rotation depending on the layer
    v8 fix COLOR_VIEW on r6xx for linear array mode, use COLOR_VIEW on
       evergreen, align bytes per pixel to a multiple of a dword
    v9 fix handling of stencil on evergreen, half fix for compressed
       texture
    v10 fix evergreen compressed texture proper support for stencil
        tile split. Fix stencil issue when array mode was clear by
        the kernel, always program stencil bo. On evergreen depth
        buffer bo need to be big enough to hold depth buffer + stencil
        buffer as even with stencil disabled things get written there.
    v11 rebase on top of mesa, fix pitch issue with 1d surface on evergreen,
        old ddx overestimate those. Fix linear case when pitch*height < 64.
        Fix r300g.
    v12 Fix linear case when pitch*height < 64 for old path, adapt to
        libdrm API change
    v13 add libdrm check

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
======================================================================

radeonsi: Remove PIPE_TRANSFER_MAP_PERMANENTLY

This was removed in commit:

commit 62f44f670bb0162e89fd4786af877f8da9ff607c
Author: Marek Olšák <maraeo@gmail.com>
Date:   Mon Mar 5 13:45:00 2012 +0100

    Revert "gallium: add flag PIPE_TRANSFER_MAP_PERMANENTLY"

    This reverts commit 0950086376b1c8b7fb89eda81ed7f2f06dee58bc.

    It was decided to refactor the transfer API instead of adding workarounds
    to address the performance issues.

======================================================================

radeonsi: Handle PIPE_VIDEO_CAP_PREFERED_FORMAT.

Reintroduced in commit 9d9afcb5bac2931d4b8e6d1aa571e941c5110c90.

======================================================================

radeonsi: nuke the fallback for vertex and fragment color clamping

Ported from r600g commit c2b800cf38b299c1ab1c53dc0e4ea00c7acef853.

======================================================================

radeonsi: don't expose transform_feedback2 without kernel support

Ported from r600g commit 15146fd1bcbb08e44a1cbb984440ee1a5de63d48.

======================================================================

radeonsi: Handle PIPE_CAP_GLSL_FEATURE_LEVEL.

Ported from r600g part of commit 171be755223d99f8cc5cc1bdaf8bd7b4caa04b4f.

======================================================================

radeonsi: set minimum point size to 1.0 for non-sprite non-aa points.

Ported from r600g commit f183cc9ce3ad1d043bdf8b38fd519e8f437714fc.

======================================================================

radeonsi: rework and consolidate stencilref state setting.

Ported from r600g commit a2361946e782b57f0c63587841ca41c0ea707070.

======================================================================

radeonsi: cleanup setting DB_SHADER_CONTROL.

Ported from r600g commit 3d061caaed13b646ff40754f8ebe73f3d4983c5b.

======================================================================

radeonsi: Get rid of register masks.

Ported from r600g commits
3d061caaed13b646ff40754f8ebe73f3d4983c5b..9344ab382a1765c1a7c2560e771485edf4954fe2.

======================================================================

radeonsi: get rid of r600_context_reg.

Ported from r600g commits
9344ab382a1765c1a7c2560e771485edf4954fe2..bed20f02a771f43e1c5092254705701c228cfa7f.

======================================================================

radeonsi: Fix regression from 'Get rid of register masks'.

======================================================================

radeonsi: optimize r600_resource_va.

Ported from r600g commit 669d8766ff3403938794eb80d7769347b6e52174.

======================================================================

radeonsi: remove u8,u16,u32,u64 types.

Ported from r600g commit 78293b99b23268e6698f1267aaf40647c17d95a5.

======================================================================

radeonsi: merge r600_context with r600_pipe_context.

Ported from r600g commit e4340c1908a6a3b09e1a15d5195f6da7d00494d0.

======================================================================

radeonsi: Miscellaneous context cleanups.

Ported from r600g commits
e4340c1908a6a3b09e1a15d5195f6da7d00494d0..621e0db71c5ddcb379171064a4f720c9cf01e888.

======================================================================

radeonsi: add a new simple API for state emission.

Ported from r600g commits
621e0db71c5ddcb379171064a4f720c9cf01e888..f661405637bba32c2cfbeecf6e2e56e414e9521e.

======================================================================

radeonsi: Also remove sbu_flags member of struct r600_reg.

Requires using sid.h instead of r600d.h for the new CP_COHER_CNTL definitions,
so some code needs to be disabled for now.

======================================================================

radeonsi: Miscellaneous simplifications.

Ported from r600g commits 38bf2763482b4f1b6d95cd51aecec75601d8b90f and
b0337b679ad4c2feae59215104cfa60b58a619d5.

======================================================================

radeonsi: Handle PIPE_CAP_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION.

Ported from commit 8b4f7b0672d663273310fffa9490ad996f5b914a.

======================================================================

radeonsi: Use a fake reloc to sleep for fences.

Ported from r600g commit 8cd03b933cf868ff867e2db4a0937005a02fd0e4.

======================================================================

radeonsi: adapt to get_query_result interface change.

Ported from r600g commit 4445e170bee23a3607ece0e010adef7058ac6a11.

12 years ago st/vega: silence enum cast warnings
Dylan Noblesmith [Sun, 1 Apr 2012 19:47:07 +0000 (19:47 +0000)]
st/vega: silence enum cast warnings

clang warns on these:

stroker.c:626:19: warning: implicit conversion from enumeration
type 'VGPathCommand' to different enumeration type 'VGPathSegment'
[-Wconversion]

No change in the underlying value.
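
The kind of change described above can be sketched as follows. This is a minimal illustration only: the enum names and values are hypothetical stand-ins for the two OpenVG types named in the warning, and the actual stroker.c code is not reproduced here. An explicit cast tells the compiler the conversion between enum types is intended, and the integer value carries over unchanged.

```c
#include <assert.h>

/* Hypothetical enums standing in for the two types named in the warning.
 * The names and values are assumptions made for this sketch. */
enum path_segment { SEG_CLOSE = 0, SEG_MOVE_TO = 2, SEG_LINE_TO = 4 };
enum path_command { CMD_MOVE_TO_ABS = 2, CMD_LINE_TO_ABS = 4 };

/* The explicit cast states the intent to cross enum types.
 * The underlying integer value is not modified by the cast. */
static enum path_segment command_to_segment(enum path_command cmd)
{
    return (enum path_segment)cmd;
}
```

Without the cast, assigning a `path_command` to a `path_segment` is still valid C, but a compiler may flag it as a possible mistake.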

Reviewed-by: Brian Paul <brianp@vmware.com>

12 years ago i965: fix typo
Dylan Noblesmith [Sun, 1 Apr 2012 19:04:47 +0000 (19:04 +0000)]
i965: fix typo

Noticed by clang:

brw_wm_surface_state.c:330:30: warning: initializer overrides prior
initialization of this subobject [-Winitializer-overrides]
      [MESA_FORMAT_Z24_S8] = 0,
                             ^
brw_wm_surface_state.c:326:30: note: previous initialization is here
      [MESA_FORMAT_Z24_S8] = 0,
                             ^

No functionality change, since the array is declared static so
it was zero-initialized by default.
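
The reasoning above can be checked with a small sketch. The enum and table here are hypothetical and do not reproduce the real MESA_FORMAT_* values or the i965 table; they only show that an entry left out of a static table built with designated initializers reads back as zero, the same as an entry explicitly set to 0.

```c
#include <assert.h>

/* Hypothetical format indices, assumed for this sketch only. */
enum fmt { FMT_A, FMT_B, FMT_C, FMT_COUNT };

/* FMT_B is never mentioned, so it is zero-initialized.
 * FMT_C is listed exactly once and explicitly set to 0. */
static const int table[FMT_COUNT] = {
    [FMT_A] = 7,
    [FMT_C] = 0,
};

/* Look up the table entry for a given format index. */
static int lookup(enum fmt f)
{
    return table[f];
}
```

Here `lookup(FMT_B)` and `lookup(FMT_C)` both return 0, which is why a duplicated `= 0` entry in place of the intended one does not change behavior.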

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>

12 years ago mesa: fix truncated value warning
Dylan Noblesmith [Sun, 1 Apr 2012 18:59:28 +0000 (18:59 +0000)]
mesa: fix truncated value warning

Silences a clang warning:

format_pack.c:2546:30: warning: implicit conversion from 'int' to
'GLubyte' (aka 'unsigned char') changes value from 65535 to 255
[-Wconstant-conversion]
               d[i] = d[i] ? 0xffff : 0x0;
                           ~ ^~~~~~
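
A minimal sketch of what the warning describes, assuming an 8-bit `unsigned char`. The function names are hypothetical and this is not the format_pack.c code: storing 0xffff into a byte-sized element keeps only the low 8 bits, so the stored value is 255 either way, and writing a constant that fits the destination type expresses the same result without the implicit truncation.

```c
#include <assert.h>

/* Converting 0xffff to unsigned char reduces it modulo 256 (for 8-bit
 * chars), so only the low byte, 0xff, survives. */
static unsigned char store_truncated(int flag)
{
    int wide = flag ? 0xffff : 0x0;
    return (unsigned char)wide;
}

/* Same stored result, written with a constant that already fits the
 * destination type, so nothing is truncated. */
static unsigned char store_exact(int flag)
{
    return flag ? 0xff : 0x0;
}
```

Both helpers return the same byte for any input, which matches the warning text: the value changes from 65535 to 255 on conversion.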

Reviewed-by: Brian Paul <brianp@vmware.com>