Luca Barbieri [Tue, 14 Sep 2010 03:10:59 +0000 (05:10 +0200)]
mesa/st: ask GLSL to not emit noise since we have a dummy implementation
Note, BTW, that the Gallium implementation returns 0.5, which seems
to violate the GLSL spec, where it should return 0.0 instead.
Not sure whether changing it to 0 is correct or not.
Luca Barbieri [Mon, 6 Sep 2010 00:31:20 +0000 (02:31 +0200)]
mesa/st: set compiler options based on Gallium shader caps
This turns on if conversion and unlimited loop unrolling if control
flow is not supported.
NOTE: this will change the behavior of r300g and any other driver
that doesn't advertise control flow
Luca Barbieri [Sun, 5 Sep 2010 18:50:50 +0000 (20:50 +0200)]
gallium: introduce get_shader_param (ALL DRIVERS CHANGED) (v3)
Changes in v3:
- Also change trace, which I forgot about
Changes in v2:
- No longer adds tessellation shaders
Currently each shader cap has FS and VS versions.
However, we want a version of them for geometry, tessellation control,
and tessellation evaluation shaders, and want to be able to easily
query a given cap type for a given shader stage.
Since having 5 duplicates of each shader cap is unmanageable, add
a new get_shader_param function that takes both a shader cap from a
new enum and a shader stage.
Drivers with non-unified shaders will first switch on the shader
and, within each case, switch on the cap.
Drivers with unified shaders instead first check whether the shader
is supported, and then switch on the cap.
MAX_CONST_BUFFERS is now per-stage.
The geometry shader cap is removed in favor of checking whether the
limit of geometry shader instructions is greater than 0, which is also
used for tessellation shaders.
WARNING: all drivers changed and compiled but only nvfx tested
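The two driver-side dispatch styles described above can be sketched roughly as follows. This is only an illustration: the enum names, function names, and limit values are hypothetical and are not the actual Gallium definitions.

```c
/* Hypothetical names and limits for illustration; not the real Gallium enums. */
enum shader_stage { STAGE_VERTEX, STAGE_FRAGMENT, STAGE_GEOMETRY };
enum shader_cap {
    CAP_MAX_INSTRUCTIONS,
    CAP_MAX_CONST_BUFFERS,
    CAP_MAX_CONTROL_FLOW_DEPTH
};

/* Non-unified shaders: switch on the stage first, then on the cap. */
int nonunified_get_shader_param(enum shader_stage stage, enum shader_cap cap)
{
    switch (stage) {
    case STAGE_VERTEX:
        switch (cap) {
        case CAP_MAX_INSTRUCTIONS:       return 256;
        case CAP_MAX_CONST_BUFFERS:      return 1;
        case CAP_MAX_CONTROL_FLOW_DEPTH: return 4;
        }
        break;
    case STAGE_FRAGMENT:
        switch (cap) {
        case CAP_MAX_INSTRUCTIONS:       return 512;
        case CAP_MAX_CONST_BUFFERS:      return 1;
        case CAP_MAX_CONTROL_FLOW_DEPTH: return 0;
        }
        break;
    case STAGE_GEOMETRY:
        break; /* stage not supported: every cap reads as 0 */
    }
    return 0;
}

/* Unified shaders: check that the stage is supported, then switch on the cap. */
int unified_get_shader_param(enum shader_stage stage, enum shader_cap cap)
{
    if (stage != STAGE_VERTEX && stage != STAGE_FRAGMENT &&
        stage != STAGE_GEOMETRY)
        return 0;

    switch (cap) {
    case CAP_MAX_INSTRUCTIONS:       return 16384;
    case CAP_MAX_CONST_BUFFERS:      return 16;
    case CAP_MAX_CONTROL_FLOW_DEPTH: return 32;
    }
    return 0;
}

/* A stage counts as supported when its instruction limit is greater than 0. */
int stage_supported(int (*get)(enum shader_stage, enum shader_cap),
                    enum shader_stage stage)
{
    return get(stage, CAP_MAX_INSTRUCTIONS) > 0;
}
```

With this shape, a caller would test for geometry shaders via stage_supported(get, STAGE_GEOMETRY) rather than through a dedicated cap.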
Ian Romanick [Tue, 14 Sep 2010 00:53:32 +0000 (17:53 -0700)]
glsl2: Port equal() and notEqual() to ir_unop_all_equal and ir_unop_any_nequal
Luca Barbieri [Tue, 7 Sep 2010 23:31:39 +0000 (01:31 +0200)]
glsl: introduce ir_binop_all_equal and ir_binop_any_nequal, allow vector cmps
Currently GLSL IR forbids any vector comparisons, and defines "ir_binop_equal"
and "ir_binop_nequal" to compare all elements and give a single bool.
This is highly unintuitive and prevents generation of optimal Mesa IR.
Hence, first rename "ir_binop_equal" to "ir_binop_all_equal" and
"ir_binop_nequal" to "ir_binop_any_nequal".
Second, readd "ir_binop_equal" and "ir_binop_nequal" with the same semantics
as less, lequal, etc.
Third, allow all comparisons to act on vectors.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
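To illustrate the difference between the two kinds of comparison, here is a small sketch in plain C. The vec4/bvec4 types and function names are hypothetical stand-ins, not GLSL IR code: the "all"/"any" forms reduce to a single bool, while the re-added forms compare component by component.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for a 4-component float vector and bool vector. */
typedef struct { float v[4]; } vec4;
typedef struct { bool v[4]; } bvec4;

/* Component-wise equality: same shape of result as less, lequal, etc. */
bvec4 vec4_equal(vec4 a, vec4 b)
{
    bvec4 r;
    for (int i = 0; i < 4; i++)
        r.v[i] = (a.v[i] == b.v[i]);
    return r;
}

/* Reducing equality: true only if every component matches. */
bool vec4_all_equal(vec4 a, vec4 b)
{
    for (int i = 0; i < 4; i++)
        if (a.v[i] != b.v[i])
            return false;
    return true;
}

/* Reducing inequality: true if at least one component differs. */
bool vec4_any_nequal(vec4 a, vec4 b)
{
    return !vec4_all_equal(a, b);
}
```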
Luca Barbieri [Tue, 7 Sep 2010 15:03:43 +0000 (17:03 +0200)]
loop_unroll: unroll loops with (lowered) breaks
If the loop ends with an if containing a single break, or with a single
break, unroll it. Loops that end with a continue will have that continue removed by
the redundant jump optimizer. Likewise loops that end with an
if-statement with a break at the end of both branches will have the
break pulled out after the if-statement.
Loops of the form
   for (...) {
      do_something1();
      if (cond) {
         do_something2();
         break;
      } else {
         do_something3();
      }
   }
will be unrolled as
   do_something1();
   if (cond) {
      do_something2();
   } else {
      do_something3();
      do_something1();
      if (cond) {
         do_something2();
      } else {
         do_something3();
         /* Repeat inserting iterations here.*/
      }
   }
ir_lower_jumps can guarantee that all loops are put in this form
and thus all loops are now potentially unrollable if an upper bound
on the number of iterations can be found.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
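As a concrete instance of the transformation above, the following sketch shows a loop with a known bound of 3 iterations next to a hand-unrolled equivalent. The function names and the loop body are made up for illustration; the real pass works on GLSL IR, not C.

```c
/* Loop form: do_something1 loads a value, do_something2 runs before the
 * break, do_something3 runs on the non-breaking path. */
int sum_until_negative_loop(const int data[3])
{
    int acc = 0;
    for (int i = 0; i < 3; i++) {
        int x = data[i];          /* do_something1 */
        if (x < 0) {
            acc = -acc;           /* do_something2 */
            break;
        } else {
            acc += x;             /* do_something3 */
        }
    }
    return acc;
}

/* Unrolled form: each later iteration is nested in the else branch of the
 * previous one, so the break disappears. */
int sum_until_negative_unrolled(const int data[3])
{
    int acc = 0;
    int x = data[0];
    if (x < 0) {
        acc = -acc;
    } else {
        acc += x;
        x = data[1];
        if (x < 0) {
            acc = -acc;
        } else {
            acc += x;
            x = data[2];
            if (x < 0) {
                acc = -acc;
            } else {
                acc += x;
            }
        }
    }
    return acc;
}
```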
Ian Romanick [Mon, 13 Sep 2010 21:25:26 +0000 (14:25 -0700)]
glsl2: Add pass to remove redundant jumps
Ian Romanick [Mon, 13 Sep 2010 20:46:29 +0000 (13:46 -0700)]
glsl: Explain file naming convention
Luca Barbieri [Tue, 7 Sep 2010 15:02:37 +0000 (17:02 +0200)]
loop_controls: fix analysis of already analyzed loops
The loop_controls pass didn't look at the counter values it put in ir_loop
on previous iterations, so while the first iteration worked, subsequent
ones couldn't determine max_iterations.
Ian Romanick [Mon, 13 Sep 2010 18:05:05 +0000 (11:05 -0700)]
i965: Request that returns be lowered in shader main
Fixes piglit tests glsl-vs-main-return and glsl-fs-main-return.
Luca Barbieri [Tue, 7 Sep 2010 00:15:26 +0000 (02:15 +0200)]
glsl: call ir_lower_jumps according to compiler options
Luca Barbieri [Mon, 6 Sep 2010 22:24:08 +0000 (00:24 +0200)]
glsl: add continue/break/return unification/elimination pass (v2)
Changes in v2:
- Base class renamed to ir_control_flow_visitor
- Tried to comply with coding style
This is a new pass that supersedes ir_if_return and "lowers" jumps
to if/else structures.
Currently it causes no regressions on softpipe and nv40, but I'm not sure
whether the piglit glsl tests are thorough enough, so consider this
experimental.
It can be asked to:
1. Pull jumps out of ifs where possible
2. Remove all "continue"s, replacing them with an "execute flag"
3. Replace all "break" with a single conditional one at the end of the loop
4. Replace all "return"s with a single return at the end of the function,
for the main function and/or other functions
This gives several great benefits:
1. All functions can be inlined after this pass
2. nv40 and other pre-DX10 chips without "continue" can be supported
3. nv30 and other pre-DX10 chips with no control flow at all are better supported
Note that for full effect we should also teach the unroller to unroll
loops with a fixed maximum number of iterations but with the canonical
conditional "break" that this pass will insert if asked to.
Continues are lowered by adding a per-loop "execute flag", initialized to
TRUE, that when cleared inhibits all execution until the end of the loop.
Breaks are lowered to continues, plus setting a "break flag" that is checked
at the end of the loop, and triggers the unique "break".
Returns are lowered to breaks/continues, plus adding a "return flag" that
causes loops to break again out of their enclosing loops until all the
loops are exited: then the "execute flag" logic will ignore everything
until the end of the function.
Note that "continue" and "return" can also be implemented by adding
a dummy loop and using break.
However, this is bad for hardware with limited nesting depth, and
prevents further optimization, and thus is not currently performed.
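The flag-based lowering of "continue" and "break" can be sketched in plain C as below. This is a hand-written illustration with hypothetical function names, not output of the pass: the lowered version has no continue and exactly one conditional break at the end of the loop body.

```c
#include <stdbool.h>

/* Original control flow: skip zeros, stop at the first negative value. */
int sum_original(const int *v, int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++) {
        if (v[i] == 0)
            continue;
        if (v[i] < 0)
            break;
        sum += v[i];
    }
    return sum;
}

/* Lowered control flow using an "execute flag" and a "break flag". */
int sum_lowered(const int *v, int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++) {
        bool execute_flag = true;   /* cleared: skip the rest of the body */
        bool break_flag = false;    /* set: leave the loop at the end */

        if (v[i] == 0)
            execute_flag = false;   /* was "continue" */

        if (execute_flag) {
            if (v[i] < 0) {
                break_flag = true;  /* was "break": continue + break flag */
                execute_flag = false;
            }
        }

        if (execute_flag)
            sum += v[i];

        if (break_flag)
            break;                  /* the single break at the loop end */
    }
    return sum;
}
```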
Luca Barbieri [Mon, 6 Sep 2010 22:22:34 +0000 (00:22 +0200)]
glsl: add ir_control_flow_visitor
This is just a subclass of ir_visitor with empty implementations of all
the visit methods for non-control flow nodes.
Used to avoid duplicating that in ir_visitor subclasses.
ir_hierarchical_visitor is another way to solve this, but is less natural
for some applications.
José Fonseca [Mon, 13 Sep 2010 19:43:36 +0000 (20:43 +0100)]
llvmpipe: Fix non SSE2 builds.
Should fix fdo 30168.
Marek Olšák [Mon, 13 Sep 2010 19:08:48 +0000 (21:08 +0200)]
r300g/swtcl: unlock VBO after draw_flush
https://bugs.freedesktop.org/show_bug.cgi?id=29901
https://bugs.freedesktop.org/show_bug.cgi?id=30132
Witold Baryluk [Mon, 13 Sep 2010 17:57:35 +0000 (18:57 +0100)]
llvmpipe: Change asm to __asm__.
According to gcc documentation both are equivalent,
but the second is preferred, as the first can conflict with existing symbols.
Signed-off-by: José Fonseca <jfonseca@vmware.com>
Jesse Barnes [Mon, 13 Sep 2010 17:55:16 +0000 (10:55 -0700)]
EGL DRI2: 0xa011 is Pineview not Ironlake
Point about needing a better way to do this validated.
Alex Deucher [Mon, 13 Sep 2010 17:36:19 +0000 (13:36 -0400)]
r600c: const buffer sizes must be a multiple of 16 consts
This applies to r6xx/r7xx/evergreen
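One way to satisfy such a constraint is to round the constant count up to the next multiple of 16. The helper below is a hypothetical sketch of that arithmetic and is not taken from the driver; whether the driver rounds up or handles this differently is an assumption here.

```c
/* Round a constant count up to the next multiple of 16 (hypothetical helper). */
unsigned align_const_count(unsigned num_consts)
{
    return (num_consts + 15u) & ~15u;
}
```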
Jesse Barnes [Mon, 13 Sep 2010 17:35:45 +0000 (10:35 -0700)]
EGL DRI2: add PCI ID for Ironlake mobile
Allows KMS EGL driver to load. We need a better way of doing this.
Alex Deucher [Mon, 13 Sep 2010 16:16:00 +0000 (12:16 -0400)]
r600c/eg: remove obsolete comment
Alex Deucher [Mon, 13 Sep 2010 16:14:24 +0000 (12:14 -0400)]
r600c/eg: remove unused emit timestamp function
Alex Deucher [Mon, 13 Sep 2010 16:11:29 +0000 (12:11 -0400)]
r600c/eg: emit CB_BLEND_ALPHA with the other blend values
saves a few dwords
Alex Deucher [Mon, 13 Sep 2010 16:06:34 +0000 (12:06 -0400)]
r600c: remove redundant state emit on evergreen
r700start3d already emits the context control packets
Kristian Høgsberg [Mon, 13 Sep 2010 14:31:45 +0000 (10:31 -0400)]
mesa: Revert accidentally committed vertex code chunk
Andre Maasikas [Mon, 13 Sep 2010 13:55:09 +0000 (16:55 +0300)]
r600c: eg: fix typo
probably copy/paste error
Andre Maasikas [Mon, 13 Sep 2010 13:28:16 +0000 (16:28 +0300)]
r600c: eg: 256 float4 constants may need more than 256 bytes
Andre Maasikas [Mon, 13 Sep 2010 13:17:10 +0000 (16:17 +0300)]
r600c: eg - fix uninitialized variable
Kristian Høgsberg [Mon, 13 Sep 2010 12:39:42 +0000 (08:39 -0400)]
glx: Don't destroy DRI2 drawables for legacy glx drawables
For GLX 1.3 drawables, we can destroy the DRI2 drawable when the GLX
drawable is destroyed. However, for legacy drawables, there is no
good way of knowing when the application is done with it, so we just
let the DRI2 drawable linger on the server. The server will destroy
the DRI2 drawable when it destroys the X drawable or the client exits
anyway.
https://bugs.freedesktop.org/show_bug.cgi?id=30109
Marek Olšák [Mon, 13 Sep 2010 10:58:19 +0000 (12:58 +0200)]
r300g: fix SWTCL
https://bugs.freedesktop.org/show_bug.cgi?id=29901
José Fonseca [Mon, 13 Sep 2010 11:03:07 +0000 (12:03 +0100)]
llvmpipe: Unbreak rasterization on 64bit.
José Fonseca [Mon, 13 Sep 2010 10:14:54 +0000 (11:14 +0100)]
gallium: Change the resource_copy_region semantics to allow copies between different yet compatible formats
Dave Airlie [Mon, 13 Sep 2010 08:50:12 +0000 (18:50 +1000)]
r600g: evergreen fixup dsa state for running query.
evergreen is always the same as r700 here.
Andre Maasikas [Mon, 13 Sep 2010 09:21:18 +0000 (12:21 +0300)]
r600c: remove stray unmap call
no idea how/why it got there
José Fonseca [Mon, 13 Sep 2010 08:24:09 +0000 (09:24 +0100)]
llvmpipe: use gcc asm only with gcc
Marek Olšák [Mon, 13 Sep 2010 07:54:46 +0000 (09:54 +0200)]
r300g: print unassigned FS inputs for DBG_RS
Marek Olšák [Mon, 13 Sep 2010 05:44:32 +0000 (07:44 +0200)]
r300g: fix map_buffer
https://bugs.freedesktop.org/show_bug.cgi?id=30145
Marek Olšák [Mon, 13 Sep 2010 05:51:47 +0000 (07:51 +0200)]
r300/compiler: fix warnings
Marek Olšák [Fri, 10 Sep 2010 07:18:03 +0000 (09:18 +0200)]
r300g: add new debug options for dumping scissor regs and disabling CBZB clear
Marek Olšák [Fri, 10 Sep 2010 05:58:07 +0000 (07:58 +0200)]
r300g: skip rendering if CS space validation fails
radeon_cs_space_check flushes the pipe context on failure, retries
the validation, and returns -1 if it fails again. At that point, there is
nothing we can do, so let's skip draw operations instead of getting stuck
in an infinite loop.
This code path ideally should never be hit.
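The control flow described above (check, flush and retry once, then give up and skip the draw) might look roughly like this. All names prefixed with fake_ are hypothetical stand-ins; only the flush-retry-then-return -1 behaviour is taken from the description of radeon_cs_space_check.

```c
#include <stdbool.h>

/* Hypothetical context used to simulate validation failures. */
struct fake_ctx {
    int failures_left;  /* how many upcoming validations will fail */
    int flushes;        /* number of flushes performed */
    int draws;          /* number of draws actually emitted */
};

static int fake_validate(struct fake_ctx *c)
{
    if (c->failures_left > 0) {
        c->failures_left--;
        return -1;
    }
    return 0;
}

/* Stand-in for the space check: on failure, flush and retry once. */
int fake_space_check(struct fake_ctx *c)
{
    if (fake_validate(c) == 0)
        return 0;
    c->flushes++;
    return fake_validate(c);   /* -1 if it fails again */
}

/* Draw entry point: skip rendering instead of looping forever. */
bool fake_draw(struct fake_ctx *c)
{
    if (fake_space_check(c) != 0)
        return false;          /* nothing more we can do; skip the draw */
    c->draws++;
    return true;
}
```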
Marek Olšák [Fri, 10 Sep 2010 05:53:47 +0000 (07:53 +0200)]
r300g: remove u_upload_flush from r300_draw_arrays
This is probably a leftover and is unnecessary, since we flush u_upload_mgr
in r300_flush.
Vinson Lee [Mon, 13 Sep 2010 04:48:40 +0000 (21:48 -0700)]
nvfx: Remove unused variables.
Vinson Lee [Mon, 13 Sep 2010 04:39:21 +0000 (21:39 -0700)]
nvfx: Move declaration before code.
Fixes SCons build.
Keith Whitwell [Tue, 7 Sep 2010 22:13:31 +0000 (23:13 +0100)]
llvmpipe: introduce tri_3_4 for tiny triangles
Keith Whitwell [Tue, 7 Sep 2010 22:10:11 +0000 (23:10 +0100)]
llvmpipe: allow tri_3_16 at any 4-aligned location within a tile
Doesn't require 16-alignment, so catch more cases.
Keith Whitwell [Tue, 7 Sep 2010 22:06:57 +0000 (23:06 +0100)]
llvmpipe: refactor tri_3_16
Keep step array as a set of four m128i's and reuse throughout the
rasterization.
Keith Whitwell [Tue, 7 Sep 2010 06:55:28 +0000 (07:55 +0100)]
llvmpipe: pass linear masks to fragment shader
Fragment shader can extract the correct bits for each quad.
Keith Whitwell [Sun, 12 Sep 2010 14:01:41 +0000 (15:01 +0100)]
llvmpipe: fix warnings on both 32 and 64 bit builds
Keith Whitwell [Sun, 12 Sep 2010 13:29:00 +0000 (14:29 +0100)]
llvmpipe: fix weird performance regression in isosurf
I really don't understand the mechanism behind this, but it
seems like the way data blocks for a scene are malloced, and in
particular whether we treat them as stack or a queue, and whether
we retain the most recently allocated or least recently allocated
has a real effect (~5%) on isosurf framerates...
This is probably specific to my distro or even just my machine,
but nonetheless, it's nicer not to see the framerates go in the
wrong direction.
José Fonseca [Sun, 12 Sep 2010 09:34:53 +0000 (10:34 +0100)]
pb: Fix the build, and add notes.
José Fonseca [Sun, 12 Sep 2010 09:14:50 +0000 (10:14 +0100)]
llvmpipe: Only generate the whole shader specialization for opaque shaders.
If not opaque, then the color buffer will have to be read any way,
therefore the specialization is pointless.
Dave Airlie [Sat, 28 Aug 2010 08:59:32 +0000 (18:59 +1000)]
pb: add void * for flush ctx to mapping functions
If the buffer we are attempting to map is referenced by the unsubmitted
command stream for this context, we need to flush the command stream,
however to do that we need to be able to access the context at the lowest
level map function. Currently we set the buffer in the toplevel map, but this
is racy between contexts. (We probably have a lot more issues than that.)
I'll look into a proper solution as suggested by jrfonseca when I get some time.
Luca Barbieri [Sat, 11 Sep 2010 19:11:03 +0000 (21:11 +0200)]
nv30: fix breakage due to 10 texcoord support on nv40
Chia-I Wu [Sat, 11 Sep 2010 18:20:39 +0000 (02:20 +0800)]
Add missing files to the tarball file lists.
Chia-I Wu [Sat, 11 Sep 2010 14:07:59 +0000 (22:07 +0800)]
mesa: Fix depend.es[12] generation when LLVM is enabled.
"llvm-config --cflags" outputs -f options, which conflict with makedepend.
Clean up compiler flags and append LLVM_CFLAGS to the new xxx_CFLAGS
instead of xxx_CPPFLAGS, where xxx may be MESA, ES1, or ES2.
Tilman Sauerbeck [Sat, 11 Sep 2010 10:00:10 +0000 (12:00 +0200)]
r600g: Undo bo placement change.
This reverts a part of
e795ca8f3175fa6fd97b6b2ef2775e3f8803012a
that causes artefacts and a performance drop.
Signed-off-by: Tilman Sauerbeck <tilman@code-monkey.de>
José Fonseca [Sat, 11 Sep 2010 12:47:58 +0000 (13:47 +0100)]
llvmpipe: Silence some warnings.
José Fonseca [Fri, 10 Sep 2010 19:09:00 +0000 (20:09 +0100)]
gallivm: nr_channels is only valid for formats with plain layout.
This is erroneously throwing non plain formats out of the faster
AoS sampling path.
Doing 8bit interpolation for single channels such as L8 should be no
worse than with floating point. But this may need more investigation.
José Fonseca [Fri, 10 Sep 2010 15:37:11 +0000 (16:37 +0100)]
gallivm: Use const keyword on swizzles.
José Fonseca [Fri, 10 Sep 2010 14:00:21 +0000 (15:00 +0100)]
gallivm: Allow TGSI AoS translation to happen in BGRA ordering.
Or any ordering.
José Fonseca [Fri, 10 Sep 2010 13:59:11 +0000 (14:59 +0100)]
llvmpipe: Don't store or display the alpha ref value in the key.
It's never used.
José Fonseca [Thu, 9 Sep 2010 11:09:44 +0000 (12:09 +0100)]
gallivm: Add a new debug flag to warn about performance issues.
José Fonseca [Fri, 3 Sep 2010 10:53:48 +0000 (11:53 +0100)]
gallivm: Helper functions for pointer indirection.
José Fonseca [Fri, 3 Sep 2010 09:54:41 +0000 (10:54 +0100)]
gallivm: Cleanup the TGSI <-> sampler interface.
José Fonseca [Fri, 3 Sep 2010 09:53:39 +0000 (10:53 +0100)]
gallivm: Add some utility functions to set/get array elements too.
José Fonseca [Thu, 2 Sep 2010 11:45:50 +0000 (12:45 +0100)]
gallivm: Basic AoS TGSI -> LLVM IR.
Essentially a variation of the SoA version.
José Fonseca [Thu, 2 Sep 2010 11:14:39 +0000 (12:14 +0100)]
gallivm: Move the texture modifiers to the header.
Useful to pass these around.
José Fonseca [Thu, 2 Sep 2010 11:13:46 +0000 (12:13 +0100)]
gallivm: s/lp_build_broadcast_aos/lp_build_swizzle_scalar_aos/
More accurate description of this function's purpose.
Alex Corscadden [Wed, 8 Sep 2010 23:59:03 +0000 (16:59 -0700)]
Add a test for the KIL opcode
This is a simple test for the KIL opcode. It should render a 6 sided figure
with a colored interior.
Keith Whitwell [Sat, 11 Sep 2010 09:04:34 +0000 (10:04 +0100)]
llvmpipe: restore larger command blocks
Keith Whitwell [Wed, 8 Sep 2010 17:46:39 +0000 (18:46 +0100)]
llvmpipe: move some debug to DEBUG_SCENE
Keith Whitwell [Wed, 8 Sep 2010 17:37:45 +0000 (18:37 +0100)]
llvmpipe: add DEBUG_MEM option
Keith Whitwell [Tue, 7 Sep 2010 22:54:09 +0000 (23:54 +0100)]
llvmpipe: allow bigger scenes
Tom Stellard [Fri, 10 Sep 2010 02:13:57 +0000 (19:13 -0700)]
r300/compiler: Reorganize presub_helper()
Tom Stellard [Thu, 9 Sep 2010 17:19:52 +0000 (10:19 -0700)]
r300/compiler: Don't use presubtract in TEX instructions
Tom Stellard [Tue, 7 Sep 2010 17:23:30 +0000 (10:23 -0700)]
r300/compiler: Print the presub subtract operation in the correct order
Tom Stellard [Tue, 7 Sep 2010 17:22:16 +0000 (10:22 -0700)]
r300/compiler: Fix dataflow bug in presub_helper()
Tom Stellard [Tue, 7 Sep 2010 03:48:10 +0000 (20:48 -0700)]
r300/compiler: Replace asserts with error messages
Tom Stellard [Mon, 6 Sep 2010 22:31:07 +0000 (15:31 -0700)]
r300/compiler: Fix copy propagation for some presub instructions
Tom Stellard [Mon, 6 Sep 2010 17:57:20 +0000 (10:57 -0700)]
r300/compiler: Add peephole optimization for the 'sub' presubtract operation
Tom Stellard [Mon, 30 Aug 2010 15:59:30 +0000 (08:59 -0700)]
r300/compiler: Add peephole optimization for the 'add' presubtract operation
Tom Stellard [Sun, 5 Sep 2010 02:10:23 +0000 (19:10 -0700)]
r300/compiler: Clean up rc_pair_alloc_source()
Tom Stellard [Wed, 14 Jul 2010 04:25:27 +0000 (21:25 -0700)]
r300/compiler: Enable presubtract sources
The r300 compiler can now emit instructions that select from the presubtract
source. A peephole optimization has been added to convert instructions like:
ADD Temp[0].x, none.1, -Temp[1].x into the INV (1 - src0) presubtract
operation.
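A peephole match of this kind could be sketched as follows. The instruction model here is hypothetical and much simpler than the r300 compiler's real data structures; in particular, rewriting the ADD into a MOV that reads the presubtract result is an assumption made for the illustration.

```c
#include <stdbool.h>

/* Hypothetical, simplified instruction model; not the r300 compiler's types. */
enum opcode { OP_ADD, OP_MOV };
enum reg_file { FILE_NONE, FILE_TEMP, FILE_PRESUB_INV };

struct source {
    enum reg_file file;
    int index;
    bool is_one;   /* models the "none.1" constant-one source */
    bool negate;
};

struct instruction {
    enum opcode op;
    struct source src[2];
};

/* If inst is "ADD dst, 1, -Temp[n]", rewrite it to read the INV presubtract
 * result (1 - Temp[n]) through a single source. Returns true on a rewrite. */
bool peephole_presub_inv(struct instruction *inst)
{
    if (inst->op != OP_ADD)
        return false;

    const struct source *one = &inst->src[0];
    const struct source *neg = &inst->src[1];

    if (one->file != FILE_NONE || !one->is_one || one->negate)
        return false;
    if (neg->file != FILE_TEMP || !neg->negate)
        return false;

    struct source presub = { FILE_PRESUB_INV, neg->index, false, false };
    inst->op = OP_MOV;
    inst->src[0] = presub;
    inst->src[1] = (struct source){ FILE_NONE, 0, false, false };
    return true;
}
```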
Ian Romanick [Fri, 10 Sep 2010 20:10:26 +0000 (13:10 -0700)]
mesa: Remove unused Emit flags from gl_shader_compiler_options
Ian Romanick [Thu, 9 Sep 2010 23:27:37 +0000 (16:27 -0700)]
intel: Remove noise opcode support from i915 and i965 drivers
With recent changes to the GLSL compiler, these opcodes should never be
seen in these drivers.
Alex Deucher [Fri, 10 Sep 2010 18:14:12 +0000 (14:14 -0400)]
r600c: add missing header
Alex Deucher [Fri, 10 Sep 2010 17:26:10 +0000 (13:26 -0400)]
r600c: add OQ support for evergreen
Alex Deucher [Fri, 10 Sep 2010 17:13:08 +0000 (13:13 -0400)]
r600c: oq updates
Alex Deucher [Fri, 10 Sep 2010 16:54:44 +0000 (12:54 -0400)]
r600c: add blit support for evergreen
driver was previously calling the r600 blit code
which won't work on evergreen.
Alex Deucher [Fri, 10 Sep 2010 15:40:46 +0000 (11:40 -0400)]
r600c: emit start3d packet on evergreen
Alex Deucher [Fri, 10 Sep 2010 01:16:55 +0000 (21:16 -0400)]
r600c: fix some typos
Alex Deucher [Fri, 10 Sep 2010 00:36:23 +0000 (20:36 -0400)]
r600c: fix type in cb setup on evergreen
Alex Deucher [Fri, 10 Sep 2010 00:26:11 +0000 (20:26 -0400)]
r600c: add support for more rendering formats on evergreen
Andre Maasikas [Fri, 10 Sep 2010 11:41:33 +0000 (14:41 +0300)]
r600: set correct initial point_minmax values
Tilman Sauerbeck [Thu, 9 Sep 2010 19:33:37 +0000 (21:33 +0200)]
r600g: Fixed a bo reference leak in the draw module.
Signed-off-by: Tilman Sauerbeck <tilman@code-monkey.de>
Tilman Sauerbeck [Thu, 9 Sep 2010 13:24:50 +0000 (15:24 +0200)]
r600g: Only increase a bo's map_count if radeon_bo_map() succeeded.
Signed-off-by: Tilman Sauerbeck <tilman@code-monkey.de>
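The pattern behind this fix, bumping a reference-style counter only after the underlying operation has succeeded, can be sketched as below. The fake_bo type and both functions are hypothetical; fake_kernel_map merely stands in for radeon_bo_map() and is assumed to return 0 on success.

```c
#include <stdbool.h>

/* Hypothetical buffer object with a simulated failure switch. */
struct fake_bo {
    int map_count;
    bool fail_next_map;
};

/* Stand-in for the low-level map call: 0 on success, negative on failure. */
static int fake_kernel_map(struct fake_bo *bo)
{
    return bo->fail_next_map ? -1 : 0;
}

/* Only count the mapping once we know it succeeded. */
int fake_bo_map(struct fake_bo *bo)
{
    int r = fake_kernel_map(bo);
    if (r != 0)
        return r;        /* leave map_count untouched on failure */
    bo->map_count++;
    return 0;
}
```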
Tilman Sauerbeck [Thu, 9 Sep 2010 12:57:32 +0000 (14:57 +0200)]
r600g: Fixed a bo leak in the error path of radeon_ctx_set_bo_new().
Signed-off-by: Tilman Sauerbeck <tilman@code-monkey.de>
Tilman Sauerbeck [Thu, 9 Sep 2010 12:03:46 +0000 (14:03 +0200)]
r600g: Fixed a bo leak in r600_texture_from_handle().
We would leak bo if the argument check failed.
Signed-off-by: Tilman Sauerbeck <tilman@code-monkey.de>
Tilman Sauerbeck [Thu, 9 Sep 2010 11:51:51 +0000 (13:51 +0200)]
r600g: Don't leave stale references in query_list when we cannot create bo.
Signed-off-by: Tilman Sauerbeck <tilman@code-monkey.de>
Tilman Sauerbeck [Wed, 8 Sep 2010 09:21:21 +0000 (11:21 +0200)]
r600g: Implemented the y component write for the LOG opcode.
This makes the 'vp1-LOG test' piglit test work.
Signed-off-by: Tilman Sauerbeck <tilman@code-monkey.de>
Chia-I Wu [Fri, 10 Sep 2010 10:26:03 +0000 (18:26 +0800)]
egl: Simplify _eglBindContext.
Remove the hard-to-get-right _eglBindContextToSurfaces. Also fix
an assertion failure from
b90a3e7d8b1bcd412ddbf2a4803de2756dacd436 that occurs when
a call sequence such as the following is hit:
eglMakeCurrent(dpy, surf1, surf1, ctx1);
eglMakeCurrent(dpy, surf2, surf2, ctx2);
eglMakeCurrent(dpy, surf1, surf1, ctx1);