Kenneth Graunke [Sun, 5 Sep 2010 07:58:34 +0000 (00:58 -0700)]
ir_reader: Only validate IR when a global 'debug' flag is set.
This extra validation is very useful when working on the built-ins, but
in general overkill - the results should stay the same unless the
built-ins or ir_validate have changed.
Also, validating all the built-in functions in every test case makes
piglit run unacceptably slow.
Marek Olšák [Sun, 5 Sep 2010 03:07:02 +0000 (05:07 +0200)]
r300g,r300c: memset the compiler struct to zeros
This should fix bogus reports "Too many temporaries." and maybe some others.
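A minimal sketch of why the zeroing matters. The struct and function names (`toy_compiler`, `toy_compiler_init`, `toy_compiler_check_temps`) are hypothetical stand-ins, not the actual r300 compiler types. The point is only that a struct on the stack holds indeterminate values until it is cleared, so a limit check that reads an unset field can fail spuriously.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a compiler state struct; not the r300 type. */
struct toy_compiler {
    unsigned num_temps;      /* temporaries used so far */
    unsigned max_temp_regs;  /* hardware limit, set by the caller */
    int error;               /* nonzero once an error was reported */
};

/* Clear every field before use so no stale stack data leaks in. */
static void toy_compiler_init(struct toy_compiler *c, unsigned max_temp_regs)
{
    assert(c != NULL);
    memset(c, 0, sizeof(*c));
    c->max_temp_regs = max_temp_regs;
}

/* Returns 1 if the shader fits, 0 for a "Too many temporaries." case. */
static int toy_compiler_check_temps(const struct toy_compiler *c)
{
    return c->num_temps <= c->max_temp_regs;
}
```

Without the memset, `num_temps` and `error` would start with whatever happened to be in memory, and the check could report an overflow for an empty shader.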
Tom Stellard [Sun, 5 Sep 2010 00:27:55 +0000 (17:27 -0700)]
r300/compiler: Remove stray break statement
This fixes glsl-fs-loop-nested.
Luca Barbieri [Sat, 4 Sep 2010 23:08:08 +0000 (01:08 +0200)]
nvfx: support unlimited constants and immediates in fp
Luca Barbieri [Sat, 4 Sep 2010 21:20:33 +0000 (23:20 +0200)]
nvfx: support using blitter to copy depth/stencil resources, fix Heaven
We might want to copy them as color ones though.
Also works around crash in Unigine Heaven due to failing to allocate
a 64 MB temporary in GART for a CPU copy.
Unigine Heaven now works on nv40, albeit with very heavy glitches (with
the floating branch with render_hdr 0).
Marek Olšák [Sat, 4 Sep 2010 22:43:34 +0000 (00:43 +0200)]
r300/compiler: fix the instruction limit in vertex shaders
Broken with commit
d774b0c710bb7d833d17bd12f5151a0176baad96.
Reported by Chris Rankin.
Luca Barbieri [Sat, 4 Sep 2010 19:29:43 +0000 (21:29 +0200)]
nvfx: support rendering to more formats
Luca Barbieri [Sat, 4 Sep 2010 18:16:54 +0000 (20:16 +0200)]
nvfx: move 2D format selection logic to 2D code
Luca Barbieri [Sat, 4 Sep 2010 15:12:02 +0000 (17:12 +0200)]
nvfx: fix swizzling of high bpp surfaces
Luca Barbieri [Sat, 4 Sep 2010 19:22:02 +0000 (21:22 +0200)]
nvfx: fix some subrectangle copies
Actually, we may want to get rid of the x/y coordinates for linear
surfaces, and realign the origin from scratch if necessary, instead
of doing this "on-demand realignment".
Luca Barbieri [Sat, 4 Sep 2010 19:21:34 +0000 (21:21 +0200)]
nvfx: fix inlining in nv04_2d.c
Luca Barbieri [Sat, 4 Sep 2010 18:17:39 +0000 (20:17 +0200)]
nvfx: fix the temporary copying logic and add asserts
Luca Barbieri [Sat, 4 Sep 2010 15:48:19 +0000 (17:48 +0200)]
nvfx: prevent swizzled rendering into formats where it's not supported
Marek Olšák [Fri, 3 Sep 2010 22:42:36 +0000 (00:42 +0200)]
Revert "ir_to_mesa: Load all the STATE_VAR elements of a builtin uniform to a temp."
This reverts commit
5ad74779cea07cc6a19a52874cdaef8b018e2f1b.
Sorry, but I had to revert this.
Any commit which needlessly increases the number of temporaries is wrong.
More temporaries mean less shader performance because of reduced parallelism
and therefore less efficient latency hiding. In this case, there is possible
performance degradation of every shader which uses GL state variables.
I cannot accept this.
Marek Olšák [Sat, 4 Sep 2010 17:04:51 +0000 (19:04 +0200)]
Revert "r300g: refuse to create a texture with size 0"
This reverts commit
5cdedaaf295acae13ac10feeb3143d83bc53d314.
https://bugs.freedesktop.org/show_bug.cgi?id=30002
Conflicts:
src/gallium/drivers/r300/r300_texture.c
Marek Olšák [Tue, 31 Aug 2010 03:37:28 +0000 (05:37 +0200)]
r300g: remove unnecessary assignments
Marek Olšák [Fri, 3 Sep 2010 22:02:57 +0000 (00:02 +0200)]
r300/compiler: indent printed instructions according to the branch depth
Marek Olšák [Fri, 3 Sep 2010 19:43:36 +0000 (21:43 +0200)]
r300g: skip draw calls with no vertex elements, fixing hardlocks
Marek Olšák [Fri, 3 Sep 2010 18:43:48 +0000 (20:43 +0200)]
r300/compiler: use limits from the compiler input instead of inline constants
Marek Olšák [Fri, 3 Sep 2010 18:26:43 +0000 (20:26 +0200)]
r300/compiler: improve register allocation with indexable temporaries for VS
Register allocation can now reallocate temporaries right after the last indexed
source operand, instead of being disabled for the whole shader.
Marek Olšák [Thu, 2 Sep 2010 08:21:52 +0000 (10:21 +0200)]
r300/compiler: fix handling of indexed temporaries in peephole
Marek Olšák [Thu, 2 Sep 2010 08:21:23 +0000 (10:21 +0200)]
r300/compiler: disable deadcode elimination for indexed dst operands
Marek Olšák [Thu, 2 Sep 2010 05:01:36 +0000 (07:01 +0200)]
r300/compiler: allocate at least FS inputs if register allocation is disabled
Marek Olšák [Wed, 1 Sep 2010 06:12:51 +0000 (08:12 +0200)]
r300g: add a new debug option which disables compiler optimizations
Those are:
- dead-code elimination
- constant folding
- peephole (mainly copy propagation)
- register allocation
There are some bugs which I need to track down.
Also fix up the descriptions of all the debug options.
Marek Olšák [Wed, 1 Sep 2010 06:10:32 +0000 (08:10 +0200)]
r300/compiler: compute the final number of temporaries during translation
And not during the register allocation, which may be skipped for debugging
purposes. Also the predicate register is now added to the number of temps.
Marek Olšák [Wed, 1 Sep 2010 04:14:58 +0000 (06:14 +0200)]
r300/compiler: make optimizations not use 0.5 swizzles in vertex shaders
Marek Olšák [Wed, 1 Sep 2010 03:25:34 +0000 (05:25 +0200)]
r300/compiler: use peephole and constant folding for vertex shaders too
Marek Olšák [Thu, 2 Sep 2010 00:42:42 +0000 (02:42 +0200)]
r300/compiler: remove unused enum OPCODE_REPL_ALPHA
We use RC_OPCODE_REPL_ALPHA instead.
Marek Olšák [Wed, 1 Sep 2010 03:01:19 +0000 (05:01 +0200)]
r300/compiler: refactor fragment shader compilation
This cleans up the mess in r3xx_compile_fragment_program.
Marek Olšák [Wed, 1 Sep 2010 02:59:22 +0000 (04:59 +0200)]
r300/compiler: add new compiler parameter max_constants
Marek Olšák [Wed, 1 Sep 2010 01:19:05 +0000 (03:19 +0200)]
r300/compiler: refactor vertex shader compilation
First list compiler passes in an array, then run the new function rc_run_compiler.
Every backend may need a different set of passes.
This cleans up the mess in r3xx_compile_vertex_program.
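The pass-list idea can be sketched roughly as follows. Only the name rc_run_compiler comes from the commit message; `toy_compiler`, `toy_pass`, `toy_run_compiler` and the pass functions are hypothetical, and the real signatures may well differ. Each backend would build its own array, and a NULL entry marks the end.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical compiler state; the real radeon_compiler has many more fields. */
struct toy_compiler {
    int error;      /* set by a pass to abort compilation */
    int passes_run; /* how many passes actually ran */
};

/* One entry in the pass list. A NULL run pointer terminates the list. */
struct toy_pass {
    const char *name;
    int enabled;                         /* lets a backend or debug option skip a pass */
    void (*run)(struct toy_compiler *c);
};

/* Placeholder passes that only count themselves. */
static void toy_dead_code(struct toy_compiler *c)  { c->passes_run++; }
static void toy_const_fold(struct toy_compiler *c) { c->passes_run++; }
static void toy_regalloc(struct toy_compiler *c)   { c->passes_run++; }

/* Run the enabled passes in order, stopping at the first error. */
static void toy_run_compiler(struct toy_compiler *c, const struct toy_pass *list)
{
    assert(c != NULL && list != NULL);
    for (size_t i = 0; list[i].run != NULL; i++) {
        if (!list[i].enabled)
            continue;
        list[i].run(c);
        if (c->error)
            return;
    }
}
```

Keeping the passes in data rather than in a long function body means a backend can reorder or disable them without touching the driver loop.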
Marek Olšák [Tue, 31 Aug 2010 23:55:26 +0000 (01:55 +0200)]
r300/compiler: remove a redundant parameter in rc_pair_regalloc
Marek Olšák [Tue, 31 Aug 2010 23:51:05 +0000 (01:51 +0200)]
r300/compiler: remove a redundant parameter in rc_dataflow_deadcode
&c->Base == c.
Marek Olšák [Tue, 31 Aug 2010 23:10:26 +0000 (01:10 +0200)]
r300/compiler: use null-terminated array of transformation functions
I need to reduce the number of parameters of each compiler pass function.
This is part of a larger cleanup.
Marek Olšák [Tue, 31 Aug 2010 22:59:52 +0000 (00:59 +0200)]
r300g: only check for an empty shader if there are no compile errors
Marek Olšák [Tue, 31 Aug 2010 22:56:57 +0000 (00:56 +0200)]
r300/compiler: add new compiler parameter max_alu_insts
Marek Olšák [Tue, 31 Aug 2010 18:51:37 +0000 (20:51 +0200)]
r300/compiler: put emulate_loop_state in radeon_compiler
Kenneth Graunke [Sat, 4 Sep 2010 08:09:43 +0000 (01:09 -0700)]
ir_reader: Run ir_validate on the generated IR.
It's just too easy to get something wrong in hand-written IR.
Kenneth Graunke [Sat, 4 Sep 2010 08:55:55 +0000 (01:55 -0700)]
ir_reader: Emit global variables at the top of the instruction list.
Since functions are emitted when scanning for prototypes, functions
always come first, even if the original IR listed the variable
declarations first.
Fixes an ir_validate error (to be turned on in the next commit).
Kenneth Graunke [Fri, 3 Sep 2010 23:14:40 +0000 (16:14 -0700)]
ir_reader: Drop support for reading the old assignment format.
Kenneth Graunke [Fri, 3 Sep 2010 23:21:08 +0000 (16:21 -0700)]
glsl: Regenerate autogenerated file builtin_function.cpp.
Kenneth Graunke [Fri, 3 Sep 2010 23:10:57 +0000 (16:10 -0700)]
glsl/builtins: Convert assignments to new format (with write mask).
Kenneth Graunke [Fri, 3 Sep 2010 06:54:40 +0000 (23:54 -0700)]
ir_reader: Read the new assignment format (with write mask).
This preserves the ability to read the old format, for momentary
compatibility with all the existing IR implementations of built-ins.
Kenneth Graunke [Sat, 4 Sep 2010 08:05:51 +0000 (01:05 -0700)]
ir_reader: Track the current function and report it in error messages.
Kenneth Graunke [Sat, 4 Sep 2010 07:49:23 +0000 (00:49 -0700)]
glsl/builtins: Actually print the info log if reading a builtin failed.
Luca Barbieri [Sat, 4 Sep 2010 03:30:27 +0000 (05:30 +0200)]
nvfx: consolidate tiny files
We probably want to reorganize the remaining files too, but that's
for later, maybe.
Luca Barbieri [Sat, 4 Sep 2010 03:13:06 +0000 (05:13 +0200)]
mesa/st: add missing _mesa_set_fetch_functions in st_get_tex_image
Fixes piglit fdo25614-genmipmap.
Luca Barbieri [Sat, 4 Sep 2010 02:43:02 +0000 (04:43 +0200)]
nvfx: fix vp DP2
Luca Barbieri [Sat, 4 Sep 2010 02:17:16 +0000 (04:17 +0200)]
nvfx: implement fp SSG properly
Luca Barbieri [Sat, 4 Sep 2010 01:40:49 +0000 (03:40 +0200)]
nvfx: don't claim we support preds since the driver doesn't
Luca Barbieri [Sat, 4 Sep 2010 01:35:22 +0000 (03:35 +0200)]
nv40: support all 10 texcoords
Luca Barbieri [Sat, 4 Sep 2010 01:05:28 +0000 (03:05 +0200)]
nvfx: add missing context init
Luca Barbieri [Sat, 4 Sep 2010 01:05:22 +0000 (03:05 +0200)]
nvfx: tidy up state_emit
Luca Barbieri [Sat, 4 Sep 2010 00:57:14 +0000 (02:57 +0200)]
nvfx: support all coord conventions in hardware
Luca Barbieri [Sat, 4 Sep 2010 00:37:41 +0000 (02:37 +0200)]
nvfx: add missing pushbuffer space check
Luca Barbieri [Sat, 4 Sep 2010 00:26:37 +0000 (02:26 +0200)]
nvfx: support all possible vs consts
We were incorrectly setting a register that limited the range of
constants accessible via indirect addressing.
Setting it correctly, we can address all the constants the GPU
supports.
Luca Barbieri [Sat, 4 Sep 2010 00:05:14 +0000 (02:05 +0200)]
nvfx: set magic bit to round NPOT mipmap sizes down and not up
Does any API even use rounding-up?
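For reference, the two rounding conventions differ only for non-power-of-two sizes. The helpers below are an illustrative sketch with made-up names, not driver code; they assume ordinary texture sizes, so the addition in the round-up variant does not overflow. OpenGL's NPOT rule is the round-down one, max(1, floor(size / 2^level)).

```c
#include <assert.h>

/* Size of one dimension at a given mip level, rounding down (floor). */
static unsigned mip_size_round_down(unsigned base, unsigned level)
{
    assert(level < 32);
    unsigned size = base >> level;
    return size ? size : 1;  /* a level never shrinks below 1 texel */
}

/* Same, but rounding up (ceil); shown only for comparison. */
static unsigned mip_size_round_up(unsigned base, unsigned level)
{
    assert(level < 32);
    unsigned size = (base + (1u << level) - 1) >> level;
    return size ? size : 1;
}
```

For a 5-texel-wide texture, level 1 is 2 texels wide when rounding down and 3 when rounding up; for power-of-two sizes the two agree.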
Luca Barbieri [Fri, 3 Sep 2010 21:27:49 +0000 (23:27 +0200)]
nvfx: allow nested blitter usage, fixing bug in clear
Brian Paul [Fri, 3 Sep 2010 22:35:07 +0000 (16:35 -0600)]
galahad: do map/unmap counting for resources
Brian Paul [Fri, 3 Sep 2010 22:33:17 +0000 (16:33 -0600)]
libgl-xlib: enable galahad support
If the GALLIUM_GALAHAD env var is 1 we'll wrap the regular driver with
the galahad validation driver.
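A rough sketch of the environment check described above. The helper name `env_flag_enabled` is hypothetical and the actual code may use a different mechanism to read GALLIUM_GALAHAD; this version treats only the exact value "1" as enabled.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns 1 only when the variable is set to "1". */
static int env_flag_enabled(const char *name)
{
    assert(name != NULL);
    const char *value = getenv(name);
    return value != NULL && strcmp(value, "1") == 0;
}
```

A caller would test `env_flag_enabled("GALLIUM_GALAHAD")` once at screen creation and wrap the regular driver only when it returns 1.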
Brian Paul [Fri, 3 Sep 2010 22:25:44 +0000 (16:25 -0600)]
scons: added galahad to driver list
Brian Paul [Fri, 3 Sep 2010 21:57:48 +0000 (15:57 -0600)]
mesa: also build galahad driver
Brian Paul [Fri, 3 Sep 2010 21:25:50 +0000 (15:25 -0600)]
exec_list: replace class with struct
To match the definition below.
Brian Paul [Fri, 3 Sep 2010 20:39:43 +0000 (14:39 -0600)]
mesa: fix up a comment
Brian Paul [Thu, 2 Sep 2010 20:11:53 +0000 (14:11 -0600)]
st/glx: added some comments
Luca Barbieri [Fri, 3 Sep 2010 20:06:41 +0000 (22:06 +0200)]
nvfx: implement LIT in fp
Ian Romanick [Thu, 2 Sep 2010 21:53:17 +0000 (14:53 -0700)]
glsl2: Use as_constant some places instead of constant_expression_value
The places where constant_expression_value are still used in loop
analysis are places where a new expression tree is created and
constant folding won't have happened. This is used, for example, when
we try to determine the maximal loop iteration count.
Based on review comments by Eric. "...rely on constant folding to
have done its job, instead of going all through the subtree again when
it wasn't a constant."
Ian Romanick [Fri, 27 Aug 2010 23:22:36 +0000 (16:22 -0700)]
glsl2: Allow copy / constant propagation into array indices
Ian Romanick [Fri, 27 Aug 2010 20:59:49 +0000 (13:59 -0700)]
glsl2: Add module to perform simple loop unrolling
Ian Romanick [Fri, 27 Aug 2010 22:41:20 +0000 (15:41 -0700)]
glsl2: Track the number of ir_loop_jump instructions that are in a loop
Ian Romanick [Fri, 27 Aug 2010 20:53:25 +0000 (13:53 -0700)]
ir_expression: Add static operator_string method
I've used this in quite a few debug commits that never reached an
up-stream tree.
Ian Romanick [Fri, 27 Aug 2010 20:53:56 +0000 (13:53 -0700)]
exec_node: Add insert_before that inserts an entire list
Ian Romanick [Fri, 27 Aug 2010 18:26:08 +0000 (11:26 -0700)]
glsl2: Eliminate zero-iteration loops
Ian Romanick [Thu, 26 Aug 2010 23:45:22 +0000 (16:45 -0700)]
glsl2: Perform initial bits of loop analysis during compilation
Ian Romanick [Thu, 26 Aug 2010 23:43:57 +0000 (16:43 -0700)]
glsl2: Add module to suss out loop control variables from loop analysis data
This is the next step on the road to loop unrolling
Ian Romanick [Thu, 26 Aug 2010 22:58:33 +0000 (15:58 -0700)]
glsl2: Add module to analyze variables used in loops
This is the first step eventually leading to loop unrolling.
Ian Romanick [Thu, 26 Aug 2010 22:49:33 +0000 (15:49 -0700)]
ir_to_mesa: Handle loops with loop controls set
The downside of our talloc usage is that we can't really make static
(i.e., not created with new) instances of our IR types. This leads to
a lot of unnecessary dynamic allocation in this patch.
Ian Romanick [Thu, 26 Aug 2010 22:22:06 +0000 (15:22 -0700)]
ir_validate: Validate loop control fields in ir_loop
Ian Romanick [Thu, 26 Aug 2010 22:11:26 +0000 (15:11 -0700)]
glsl2: Add cmp field to ir_loop
This represents the type of comparison between the loop induction
variable and the loop termination value.
Ian Romanick [Thu, 5 Aug 2010 22:29:24 +0000 (15:29 -0700)]
glsl2: Set a flag when visiting the assignee of an assignment
Ian Romanick [Tue, 17 Aug 2010 01:02:11 +0000 (18:02 -0700)]
exec_list: Add pop_head
Ian Romanick [Thu, 12 Aug 2010 21:55:48 +0000 (14:55 -0700)]
ir_print_visitor: Print empty else blocks more compactly
Luca Barbieri [Fri, 3 Sep 2010 18:57:44 +0000 (20:57 +0200)]
nvfx: fix division by zero in vp-ignore-input
Luca Barbieri [Fri, 3 Sep 2010 18:54:48 +0000 (20:54 +0200)]
nvfx: report correct max lodbias
Fixes piglit lodbias
Luca Barbieri [Fri, 3 Sep 2010 18:36:29 +0000 (20:36 +0200)]
nvfx: remove message
Luca Barbieri [Fri, 3 Sep 2010 16:31:18 +0000 (18:31 +0200)]
nvfx: support indirect addressing in vps
Negative or huge offsets not yet supported.
Alex Deucher [Fri, 3 Sep 2010 16:13:47 +0000 (12:13 -0400)]
r600c: add proper returns for some evergreen functions
these weren't checked anyway.
Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=29999
Luca Barbieri [Fri, 3 Sep 2010 13:44:27 +0000 (15:44 +0200)]
nvfx: fix support for more than 8 texture units (fixes etqw crash)
Dave Airlie [Fri, 3 Sep 2010 09:37:52 +0000 (19:37 +1000)]
r600g: fix segfault in state after rework
probably can improve this a bit.
Alex Deucher [Fri, 3 Sep 2010 05:13:41 +0000 (01:13 -0400)]
r600c: emit DB_HTILE_DATA_BASE on evergreen
Make the hw happy.
Dave Airlie [Fri, 3 Sep 2010 04:12:38 +0000 (14:12 +1000)]
r600g: refactor sample states into a reusable struct.
I will not cut-n-paste.
I will not cut-n-paste.
I will not cut-n-paste.
Dave Airlie [Fri, 3 Sep 2010 03:53:39 +0000 (13:53 +1000)]
r600g: reduce size of r600 context structure to !insane
It's now about 7.8k, and might actually fit in a cache.
Dave Airlie [Fri, 3 Sep 2010 03:07:40 +0000 (13:07 +1000)]
r600g: add texture border state.
Okay I finally wrapped my head around what r600_context_state is meant to be,
maybe I should just rename all the structs so that they have distinct names.
I've no idea however why 16 is a good magic number for R600_MAX_RSTATE.
Dave Airlie [Fri, 3 Sep 2010 02:01:59 +0000 (12:01 +1000)]
r600g: deref old driver states for set entry points.
Dave Airlie [Fri, 3 Sep 2010 01:55:36 +0000 (11:55 +1000)]
r600g: drop r600_bind_state.
This was another ugly function that really wasn't needed.
The 3 calls to it from the gallium api were shorter than it,
and all the calls from the set_ functions were pointless.
Dave Airlie [Fri, 3 Sep 2010 01:35:08 +0000 (11:35 +1000)]
r600g: kill r600_context_state function
having some sort of locality of code really matters, just create
and setup state at time. Not sure if this is just further polishing of a bad thing,
but at least it makes it more readable.
Dave Airlie [Fri, 3 Sep 2010 00:55:02 +0000 (10:55 +1000)]
r600g: move lots of state inline helpers to separate header.
this gets them out of sight of the main codeflow.
Vinson Lee [Fri, 3 Sep 2010 00:00:53 +0000 (17:00 -0700)]
draw: Include missing headers in draw_vs_aos.h.
Include tgsi_exec.h for TGSI_EXEC_NUM_TEMPS.
Include draw_vs.h for draw_vs_varient.
Dave Airlie [Thu, 2 Sep 2010 23:39:04 +0000 (09:39 +1000)]
r600g: drop magic numbers in depth state.
this also fixes occlusion queries.
Vinson Lee [Thu, 2 Sep 2010 23:30:34 +0000 (16:30 -0700)]
util: Include missing header in u_linear.h.
Include p_compiler.h for size_t and boolean symbols.