Paul Berry [Wed, 8 Jan 2014 19:40:23 +0000 (11:40 -0800)]
glsl/cs: Prohibit mixing of compute and non-compute shaders.
Fixes piglit test:
spec/ARB_compute_shader/linker/mix_compute_and_non_compute
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Paul Berry [Wed, 8 Jan 2014 09:54:26 +0000 (01:54 -0800)]
glsl/cs: Prohibit user-defined ins/outs in compute shaders.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Paul Berry [Thu, 9 Jan 2014 12:03:30 +0000 (04:03 -0800)]
main/cs: Implement query for COMPUTE_WORK_GROUP_SIZE.
v2: Improve error message.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Paul Berry [Wed, 8 Jan 2014 19:59:28 +0000 (11:59 -0800)]
mesa/cs: Handle compute shader local size during linking.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Paul Berry [Mon, 6 Jan 2014 17:09:31 +0000 (09:09 -0800)]
glsl/cs: Handle compute shader local_size_{x,y,z} declaration.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Paul Berry [Wed, 8 Jan 2014 09:42:58 +0000 (01:42 -0800)]
mesa/cs: Implement MAX_COMPUTE_WORK_GROUP_COUNT constant.
v2: Document that the 3-element array MaxComputeWorkGroupCount is
indexed by dimension.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Paul Berry [Mon, 6 Jan 2014 23:11:40 +0000 (15:11 -0800)]
mesa/cs: Implement MAX_COMPUTE_WORK_GROUP_INVOCATIONS constant.
Reviewed-by: Matt Turner <mattst88@gmail.com>
v2: Use CONTEXT_INT rather than CONTEXT_ENUM.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Paul Berry [Mon, 6 Jan 2014 21:31:58 +0000 (13:31 -0800)]
mesa/cs: Implement MAX_COMPUTE_WORK_GROUP_SIZE constant.
v2: Document that the 3-element array MaxComputeWorkGroupSize is
indexed by dimension.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Paul Berry [Tue, 7 Jan 2014 23:50:39 +0000 (15:50 -0800)]
mesa/cs: Create the gl_compute_program struct, and the code to initialize it.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Paul Berry [Tue, 7 Jan 2014 17:00:02 +0000 (09:00 -0800)]
mesa/cs: Handle compute shaders in _mesa_use_program().
v2: do cs after the ordered pipeline stages for consistency.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Paul Berry [Tue, 7 Jan 2014 17:00:02 +0000 (09:00 -0800)]
glsl/cs: update main.cpp to use the ".comp" extension for compute shaders.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Paul Berry [Tue, 7 Jan 2014 04:06:05 +0000 (20:06 -0800)]
glsl/cs: Populate default values for ctx->Const.Program[MESA_SHADER_COMPUTE].
Reviewed-by: Matt Turner <mattst88@gmail.com>
Paul Berry [Tue, 7 Jan 2014 04:06:05 +0000 (20:06 -0800)]
mesa/cs: Add a MESA_SHADER_COMPUTE stage and update switch statements.
This patch adds MESA_SHADER_COMPUTE to the gl_shader_stage enum.
Also, where it is trivial to do so, it adds a compute shader case to
switch statements that switch based on the type of shader. This
avoids "unhandled switch case" compiler warnings.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Paul Berry [Tue, 7 Jan 2014 03:47:25 +0000 (19:47 -0800)]
glsl/cs: Change some linker loops to use MESA_SHADER_FRAGMENT as a bound.
Linker loops that iterate through all the stages in the pipeline need
to use MESA_SHADER_FRAGMENT as a bound, so that we can add an
additional MESA_SHADER_COMPUTE stage, without it being erroneously
included in the pipeline.
Reviewed-by: Matt Turner <mattst88@gmail.com>
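As a rough illustration of the loop bound described in the commit above, the sketch below uses a simplified stand-in for the gl_shader_stage enum. The enum values, `STAGE_COUNT` and `count_pipeline_stages` are assumptions made for this example, not Mesa's actual code. Iterating up to and including MESA_SHADER_FRAGMENT visits only the ordered pipeline stages, so a compute stage appended afterwards is never treated as part of the pipeline.

```c
#include <stdbool.h>

/* Simplified stand-in for Mesa's gl_shader_stage; the values are an
 * assumption made for this sketch. */
enum shader_stage {
    MESA_SHADER_VERTEX = 0,
    MESA_SHADER_GEOMETRY,
    MESA_SHADER_FRAGMENT,
    MESA_SHADER_COMPUTE,   /* not part of the graphics pipeline */
    STAGE_COUNT            /* hypothetical: total number of stages */
};

/* Counts the linked stages that belong to the graphics pipeline. */
int count_pipeline_stages(const bool linked[STAGE_COUNT])
{
    int n = 0;
    /* Bound by MESA_SHADER_FRAGMENT, not by STAGE_COUNT, so the
     * compute stage is skipped. */
    for (int i = 0; i <= MESA_SHADER_FRAGMENT; i++) {
        if (linked[i])
            n++;
    }
    return n;
}
```

With vertex, fragment and compute marked as linked, the function counts only the vertex and fragment stages.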
Paul Berry [Mon, 6 Jan 2014 23:08:04 +0000 (15:08 -0800)]
mesa/cs: Add dispatch API stubs for ARB_compute_shader.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Paul Berry [Mon, 6 Jan 2014 17:09:07 +0000 (09:09 -0800)]
mesa/cs: Add extension enable flags for ARB_compute_shader.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Roland Scheidegger [Tue, 4 Feb 2014 18:53:53 +0000 (19:53 +0100)]
gallivm: fix F2U opcode
Previously, we were really doing F2I. Also move it to the generic section.
(Note that for llvmpipe the code generated is definitely bad, due to lack
of unsigned conversions with sse. I think though what llvm does (using scalar
conversions to 64bit signed either with x87 fpu (32bit) or sse (64bit)
including lots of domain changes is quite suboptimal, could do something like
is_large = arg >= 2^31
half_arg = 0.5 * arg
small_c = fptoint(arg)
large_c = fptoint(half_arg) << 1
res = select(is_large, large_c, small_c)
which should be far fewer instructions, but that's something llvm should do
itself.)
This fixes piglit fs/vs-float-uint-conversion.shader_test (maybe more, needs
GL 3.0 version override to run.)
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Zack Rusin <zackr@vmware.com>
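The pseudocode in the F2U commit above can be sketched in plain C as follows. This is only an illustration of the idea, not the gallivm implementation; `f2u_via_signed` is a hypothetical name. Unlike the select form in the commit message, the sketch branches, because in C a float-to-signed conversion of an out-of-range value is undefined, so only the conversion that is in range may be evaluated. Inputs are assumed to lie in [0, 2^32).

```c
#include <stdint.h>

/* Converts a float in [0, 2^32) to uint32_t using only float-to-signed
 * conversions. NaN and out-of-range inputs are not handled. */
uint32_t f2u_via_signed(float arg)
{
    if (arg >= 2147483648.0f) {            /* is_large = arg >= 2^31 */
        float half_arg = 0.5f * arg;       /* exact: only the exponent changes */
        int32_t half_c = (int32_t)half_arg;
        return (uint32_t)half_c << 1;      /* large_c */
    }
    return (uint32_t)(int32_t)arg;         /* small_c */
}
```

Halving is lossless here because every float at or above 2^31 is already a multiple of 256, so shifting the converted half back left by one bit recovers the original value.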
José Fonseca [Fri, 31 Jan 2014 16:44:39 +0000 (16:44 +0000)]
tools/trace: Handle index buffer overflow gracefully.
Trivial.
Dave Airlie [Tue, 4 Feb 2014 21:52:48 +0000 (07:52 +1000)]
docs/GL3.txt: update r600 status
This updates the r600 driver status to 3.3 being fully supported.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Thu, 30 Jan 2014 04:19:57 +0000 (04:19 +0000)]
r600g: add support for geom shaders to r600/r700 chipsets (v2)
This is my first attempt at enabling r600/r700 geometry shaders,
the basic tests pass on both my rv770 and my rv635.
It requires this kernel patch:
http://www.spinics.net/lists/dri-devel/msg52745.html
v2: address Alex comments.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Dave Airlie [Wed, 29 Jan 2014 21:48:09 +0000 (21:48 +0000)]
r600g: enable GLSL 3.30 on evergreen GPUs
This throws the switch to enable GL 3.3 and GLSL 330.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Dave Airlie [Tue, 4 Feb 2014 00:48:42 +0000 (10:48 +1000)]
r600g: properly propagate clip dist write value
This moves the value from the GS shader to the copy shader so the registers
are setup correctly.
fixes tests/spec/glsl-1.50/execution/geometry/clip-distance-out-values.shader_test
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Dave Airlie [Mon, 3 Feb 2014 05:31:26 +0000 (15:31 +1000)]
r600g: calculate a better value for array_size (v2)
attempt to calculate a better value for array size to avoid breaking apps.
v2: use 0xfff like streamout, suggested by Grigori
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Dave Airlie [Fri, 31 Jan 2014 03:35:51 +0000 (03:35 +0000)]
r600g: fix CAYMAN geometry shader support
cayman has a different end of program bit, so do that properly.
fixes hangs with geom shader tests on cayman.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Dave Airlie [Wed, 29 Jan 2014 00:17:15 +0000 (00:17 +0000)]
r600g: fix up shader out misc stuff for copy shader
set the correct values so the misc out register is setup correctly
for the copy shader.
This also updates the state for the gs copy shader so the hw
gets programmed correctly.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Dave Airlie [Tue, 28 Jan 2014 23:15:29 +0000 (23:15 +0000)]
r600g: port the layered surface rendering patch from radeonsi
This just makes r600 and evergreen do what the radeonsi codepaths do
for layered rendering. This makes the 2d amd_vertex_shader_layer test
pass on evergreen.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Dave Airlie [Tue, 28 Jan 2014 03:04:00 +0000 (13:04 +1000)]
r600g: initial VS output layer support
This just adds support for emitting the proper value in the VS out misc.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Dave Airlie [Tue, 28 Jan 2014 02:06:49 +0000 (12:06 +1000)]
r600g: setup const texture buffers for geom shaders
This just enables the workarounds we have for vertex/pixel shaders
for geom shaders as well.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Dave Airlie [Fri, 24 Jan 2014 07:14:26 +0000 (17:14 +1000)]
r600g: calculate correct cut value
This selects the cut value depending on the shader selected.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Dave Airlie [Fri, 24 Jan 2014 04:46:37 +0000 (14:46 +1000)]
r600g: fix dynamic_input_array_index.shader_test
This follows what fglrx does: it unpacks the input we are
going to indirect into a bunch of registers and indirects
inside them.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Dave Airlie [Fri, 24 Jan 2014 03:39:36 +0000 (13:39 +1000)]
r600g: add support for indirect geom ring writes
We need to be able to write to the ring using a base register
for when we emit vertices in a loop, in theory the SB compiler
could collapse these indirect writes to direct writes if the
register value is constant and known, but that is outside my
pay grade.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Dave Airlie [Tue, 24 Dec 2013 05:59:19 +0000 (05:59 +0000)]
r600g: write proper output prim type
Vadim's code derived it from the info.mode, but it needs
to be taken from the geometry shader output primitive.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Dave Airlie [Tue, 24 Dec 2013 05:30:37 +0000 (05:30 +0000)]
r600g: enable instance cnt register with new enough kernel
The instance cnt register was missing for a few kernels,
with a new enough kernel we can output it.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Dave Airlie [Mon, 23 Dec 2013 01:30:03 +0000 (01:30 +0000)]
r600g: add primitive input support for gs
only enable prim id if gs uses it
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Dave Airlie [Thu, 19 Dec 2013 05:17:00 +0000 (05:17 +0000)]
r600g: emit streamout from dma copy shader
This enables streamout with GS in the mix, from the
VS dma shader.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Dave Airlie [Wed, 18 Dec 2013 05:55:07 +0000 (15:55 +1000)]
r600g/gs: fix cases where number of gs inputs != number of gs outputs
this fixes a bunch of the geom shader built-in tests
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Dave Airlie [Tue, 28 Jan 2014 00:21:03 +0000 (10:21 +1000)]
r600g: increase array base for exported parameters
Trivial fix to Vadim's code.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Dave Airlie [Fri, 24 Jan 2014 06:41:32 +0000 (16:41 +1000)]
r600g: initialise the geom shader loop registers.
As we do for vertex and pixel shaders.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Dave Airlie [Sat, 30 Nov 2013 06:26:13 +0000 (06:26 +0000)]
r600g: emit NOPs at end of shaders in more cases
If the shader has no CF clauses at all, emit a NOP.
If the last instruction is an ENDLOOP, add a NOP for the LOOP to go to.
If the last instruction is CALL_FS, add a NOP.
These fix a bunch of hangs in the geometry shader tests.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
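The three cases listed in the NOP commit above amount to a check on the last control-flow instruction. The sketch below is an assumption-laden illustration: `enum cf_op` and `needs_trailing_nop` are hypothetical names, and the opcode list is a made-up subset rather than the r600g definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subset of control-flow opcodes, for illustration only. */
enum cf_op { CF_OP_ALU, CF_OP_EXPORT, CF_OP_ENDLOOP, CF_OP_CALL_FS };

/* Returns true if a trailing NOP should be appended to the CF program. */
bool needs_trailing_nop(const enum cf_op *cf, size_t count)
{
    if (count == 0)
        return true;                       /* no CF clauses at all */

    enum cf_op last = cf[count - 1];
    /* ENDLOOP needs a following instruction for the loop to exit to,
     * and CALL_FS must not be the final instruction either. */
    return last == CF_OP_ENDLOOP || last == CF_OP_CALL_FS;
}
```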
Dave Airlie [Thu, 28 Nov 2013 23:38:35 +0000 (23:38 +0000)]
r600g: don't enable SB for geom shaders
SB needs fixes for three GS instructions; it seems to raise
them outside loops etc. despite my best efforts.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Dave Airlie [Tue, 24 Dec 2013 04:56:25 +0000 (04:56 +0000)]
r600g/sb: add MEM_RING support
Although we don't use SB on geom shaders, the VS copy shader will use it
so we might as well implement MEM_RING support in sb.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Dave Airlie [Wed, 29 Jan 2014 04:08:43 +0000 (04:08 +0000)]
r600g: don't fail if we can't map VS->GS ring entries
This can happen in normal operation, so don't report an error on it,
just continue.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Vadim Girlin [Fri, 2 Aug 2013 02:38:23 +0000 (06:38 +0400)]
r600g: initial support for geometry shaders on evergreen (v2)
This is Vadim's initial work with a few regression fixes squashed in.
v2: (airlied)
fix regression in glsl-max-varyings - need to use vs and ps_dirty
fix regression in shader exports from rebasing.
whitespace fixing.
v2.1: squash fix assert
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Vadim Girlin [Fri, 2 Aug 2013 02:32:32 +0000 (06:32 +0400)]
r600g: add hw register definitions for GS block setup
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Vadim Girlin [Wed, 31 Jul 2013 19:09:39 +0000 (23:09 +0400)]
r600g: defer shader variant selection and depending state updates
[airlied: fix dropped streamout line - fix for master]
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Dave Airlie [Mon, 13 Jan 2014 00:19:00 +0000 (10:19 +1000)]
r600g/bc: add support for indexed memory writes.
It looks like we need these for geom shaders in the future.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Vadim Girlin [Wed, 31 Jul 2013 16:02:22 +0000 (20:02 +0400)]
r600g: move barrier and end_of_program bits from output to cf struct (v2)
v2: fix regression on r600 NOP instructions.
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Wed, 29 Jan 2014 01:33:14 +0000 (01:33 +0000)]
r600g: split streamout emit code into a separate function
For geometry shaders we need to call this code from a second place.
Just move it out for now to keep future patches cleaner.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Marek Olšák [Sat, 1 Feb 2014 14:06:39 +0000 (15:06 +0100)]
r600g,radeonsi: skip unnecessary buffer_is_busy call, add a comment
Marek Olšák [Sat, 1 Feb 2014 13:59:28 +0000 (14:59 +0100)]
r600g,radeonsi: skip busy-checking for DISCARD_RANGE if it has been done already
Marek Olšák [Sat, 1 Feb 2014 13:01:20 +0000 (14:01 +0100)]
r600g,radeonsi: treat DYNAMIC and STREAM usage as STAGING
Marek Olšák [Fri, 17 Jan 2014 21:52:28 +0000 (22:52 +0100)]
gallium: remove PIPE_CAP_MAX_COMBINED_SAMPLERS
This can be derived from the shader caps.
All GPUs from ATI/AMD, NVIDIA, and INTEL have separate texture slots
for each shader stage.
Brian Paul [Tue, 4 Feb 2014 17:38:59 +0000 (10:38 -0700)]
mesa: remove stray bits of GL_EXT_cull_vertex
GL_EXT_cull_vertex was removed back in 2010 in commit
02984e3536
but these bits still lingered.
Reviewed-by: Eric Anholt <eric@anholt.net>
Paul Berry [Fri, 31 Jan 2014 17:55:35 +0000 (09:55 -0800)]
glsl: Fix continue statements in do-while loops.
From the GLSL 4.40 spec, section 6.4 (Jumps):
The continue jump is used only in loops. It skips the remainder of
the body of the inner most loop of which it is inside. For while
and do-while loops, this jump is to the next evaluation of the
loop condition-expression from which the loop continues as
previously defined.
Previously, we incorrectly treated a "continue" statement as jumping
to the top of a do-while loop.
This patch fixes the problem by replicating the loop condition when
converting the "continue" statement to IR. (We already do a similar
thing in "for" loops, to ensure that "continue" causes the loop
expression to be executed).
Fixes piglit tests:
- glsl-fs-continue-inside-do-while.shader_test
- glsl-vs-continue-inside-do-while.shader_test
- glsl-fs-continue-in-switch-in-do-while.shader_test
- glsl-vs-continue-in-switch-in-do-while.shader_test
Cc: mesa-stable@lists.freedesktop.org
Acked-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
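The do-while fix above can be illustrated in C, which follows the same rule for "continue" as the quoted GLSL text. The first function is an ordinary do-while loop. The second is a hand-written sketch of the lowering the commit describes, with the loop condition replicated in front of the "continue". Both function names are hypothetical and the loop bodies are arbitrary.

```c
/* Source-level loop: "continue" must jump to the condition check. */
int run_do_while(int limit)
{
    int i = 0, body_runs = 0;
    do {
        body_runs++;
        i++;
        if (i % 3 == 0)
            continue;          /* goes to "while (i < limit)", not the top */
        i++;
    } while (i < limit);
    return body_runs;
}

/* Lowered form: an unconditional loop where the condition is replicated
 * in front of the "continue" as well as at the end of the body. */
int run_lowered(int limit)
{
    int i = 0, body_runs = 0;
    for (;;) {
        body_runs++;
        i++;
        if (i % 3 == 0) {
            if (!(i < limit))  /* replicated loop condition */
                break;
            continue;
        }
        i++;
        if (!(i < limit))      /* original loop condition */
            break;
    }
    return body_runs;
}
```

For a limit of 3, both versions run the body twice. A lowering that sent "continue" straight to the top of the loop would skip the condition check and run the body a third time.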
Paul Berry [Fri, 31 Jan 2014 17:50:37 +0000 (09:50 -0800)]
glsl: Make condition_to_hir() callable from outside ast_iteration_statement.
In addition to making it public, we also need to change its first
argument from an ir_loop * to an exec_list *, so that it can be used
to insert the condition anywhere in the IR (rather than just in the
body of the loop).
This will be necessary in order to make continue statements work
properly in do-while loops.
Cc: mesa-stable@lists.freedesktop.org
Acked-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Topi Pohjolainen [Mon, 27 Jan 2014 08:50:01 +0000 (10:50 +0200)]
i965/blorp: do not use unnecessary hw-blending support
This is really not needed as blorp blit programs already sample
XRGB normally and get alpha channel set to 1.0 automatically by
the sampler engine. This is simply copied directly to the payload
of the render target write message and hence there is no need for
any additional blending support from the pixel processing pipeline.
The blending formula is anyway broken for color components, it
multiplies the color component with itself (blend factor is the
component itself).
Alpha blending in turn would not fix the alpha to one independent
of the source but would simply use the source alpha as is instead
(1.0 * src_alpha + 0.0 * dst_alpha).
Quoting Eric:
"If we want to actually make the no-alpha-bits-present thing work,
we need to override the bits in the surface state or in the
generated code. In the normal draw path, it's done for sampling
by the swizzling code in brw_wm_surface_state.c, and the blending
overrides is just to fix up the alpha blending stage which
doesn't pay attention to that for the destination surface."
If one modifies piglit test gl-3.2-layered-rendering-blit to use
color component values other than zero or one, this change will
kick in on IVB. No regressions on IVB.
This is effectively a revert of
c0554141a9b831b4e614747104dcbbe0fe489b9d:
i965/blorp: Support overriding destination alpha to 1.0.
Currently, Blorp requires the source and destination formats to be
equal. However, we'd really like to be able to blit between XRGB and
ARGB formats; our BLT engine paths have supported this for a long time.
For ARGB -> XRGB, nothing needs to occur: the missing alpha is already
interpreted as 1.0. For XRGB -> ARGB, we need to smash the alpha
channel to 1.0 when writing the destination colors. This is fairly
straightforward with blending.
For now, this code is never used, as the source and destination formats
still must be equal. The next patch will relax that restriction.
NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Christian König [Mon, 3 Feb 2014 09:28:58 +0000 (02:28 -0700)]
radeon/uvd: fix feedback buffer handling v2
Without the correct feedback buffer size UVD runs
into an error on each frame, reducing the maximum FPS.
v2: fixing Michel's comments
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Cc: "10.1" "10.0" "9.2" <mesa-stable@lists.freedesktop.org>
Kenneth Graunke [Wed, 29 Jan 2014 17:27:09 +0000 (09:27 -0800)]
i965: Use brw_bo_map[_gtt]() in intel_miptree_map_raw().
This moves the intel_batchbuffer_flush before the drm_intel_bo_busy
call, which is a change in behavior. However, the old behavior was
broken.
In the future, we may want to only flush if the batchbuffer references
the BO being mapped. That's certainly more typical.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Kenneth Graunke [Wed, 29 Jan 2014 17:24:32 +0000 (09:24 -0800)]
i965: Use brw_bo_map() in intel_texsubimage_tiled_memcpy().
This additionally measures the time stalled, while also simplifying the
code.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Kenneth Graunke [Wed, 29 Jan 2014 17:09:18 +0000 (09:09 -0800)]
i965: Create drm_intel_bo_map wrappers with performance warnings.
Mapping a buffer is a common place where we could stall the CPU.
In a few places, we've added special code to check whether a buffer is
busy and log the stall as a performance warning. Most of these give no
indication of the severity of the stall, though, since measuring the
time is a small hassle.
This patch introduces a new brw_bo_map() function which wraps
drm_intel_bo_map, but additionally measures the time stalled and reports
a performance warning. If performance debugging is not enabled, it
simply maps the buffer with negligible overhead.
We also add a similar wrapper for drm_intel_gem_bo_map_gtt().
This should make it easy to add performance warnings in lots of places.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
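The wrapper pattern described in the commit above can be sketched roughly as follows. Nothing here is the i965 code: `struct fake_bo`, `fake_bo_map`, `now_ms` and `map_with_perf_warning` are hypothetical stand-ins so that the example compiles on its own, and the 0.01 ms reporting threshold is an arbitrary choice. The sketch uses C11 `timespec_get` for brevity; real code would more likely use a monotonic clock.

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical stand-in for a buffer object; not the libdrm type. */
struct fake_bo {
    void *storage;
    int busy;            /* nonzero if the GPU is still using the buffer */
};

/* Hypothetical stand-in for a blocking map call such as drm_intel_bo_map. */
static int fake_bo_map(struct fake_bo *bo, void **out)
{
    bo->busy = 0;        /* a real map would wait here until the GPU is idle */
    *out = bo->storage;
    return 0;
}

/* Current wall-clock time in milliseconds. */
static double now_ms(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return ts.tv_sec * 1000.0 + ts.tv_nsec / 1.0e6;
}

/* Maps the buffer; when perf_debug is set and the buffer was busy, also
 * times the map and reports stalls longer than 0.01 ms. */
int map_with_perf_warning(struct fake_bo *bo, void **out, int perf_debug,
                          const char *name)
{
    if (!perf_debug || !bo->busy)
        return fake_bo_map(bo, out);       /* fast path: no timing at all */

    double start = now_ms();
    int ret = fake_bo_map(bo, out);
    double stalled = now_ms() - start;
    if (stalled > 0.01)
        fprintf(stderr, "%s: stalled %.3f ms on a busy buffer\n", name, stalled);
    return ret;
}
```

Keeping the timing behind the `perf_debug` check is what lets the wrapper replace the plain map call everywhere without adding measurable cost to normal runs.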
Rob Clark [Mon, 3 Feb 2014 16:28:30 +0000 (11:28 -0500)]
freedreno: enabling binning and opt by default
Hw binning pass doesn't seem to have broken anything. And optimizing
compiler fixes a lot of shaders and doesn't seem to break anything. So
re-org slightly FD_MESA_DEBUG params and make both hw binning and
optimizer enabled by default.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Wed, 29 Jan 2014 22:18:49 +0000 (17:18 -0500)]
freedreno/a3xx/compiler: new compiler
The new compiler generates a dependency graph of instructions, including
a few meta-instructions to handle PHI and preserve some extra
information needed for register assignment, etc.
The depth pass assigns a weight/depth to each node (based on sum of
instruction cycles of a given node and all its dependent nodes), which
is used to schedule instructions. The scheduling takes into account the
minimum number of cycles/slots between dependent instructions, etc.
Which was something that could not be handled properly with the original
compiler (which was more of a naive TGSI translator than an actual
compiler).
The register assignment is currently split out as a standalone pass. I
expect that it will be replaced at some point, once I figure out what to
do about relative addressing (which is currently the only thing that
should cause fallback to old compiler).
There are a couple new debug options for FD_MESA_DEBUG env var:
optmsgs - enable debug prints in optimizer
optdump - dump instruction graph in .dot format, for example:
http://people.freedesktop.org/~robclark/a3xx/frag-0000.dot.png
http://people.freedesktop.org/~robclark/a3xx/frag-0000.dot
At this point, thanks to proper handling of instruction scheduling, the
new compiler fixes a lot of things that were broken before, and does not
appear to break anything that was working before[1]. So even though it
is not finished, it seems useful to merge it in its current state.
[1] Not merged in this commit, because I'm not sure if it really belongs
in mesa tree, but the following commit implements a simple shader
emulator, which I've used to compare the output of the new compiler to
the original compiler (ie. run it on all the TGSI shaders dumped out via
ST_DEBUG=tgsi with various games/apps):
https://github.com/freedreno/mesa/commit/
163b6306b1660e05ece2f00d264a8393d99b6f12
Signed-off-by: Rob Clark <robclark@freedesktop.org>
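One possible reading of the depth pass in the new-compiler commit above is a critical-path weight: a node's own cycle count plus the heaviest chain of instructions it depends on. The sketch below implements that reading only. It is an interpretation, not the freedreno code, and `struct instr`, `MAX_SRCS` and `instr_depth` are hypothetical. It also assumes every instruction costs at least one cycle, so that a depth of 0 can mean "not yet computed".

```c
#include <stddef.h>

#define MAX_SRCS 3   /* hypothetical limit for this sketch */

/* Hypothetical instruction node; not the actual freedreno structure. */
struct instr {
    int cycles;                     /* cost of this instruction (>= 1) */
    struct instr *srcs[MAX_SRCS];   /* instructions this one depends on */
    int depth;                      /* cached weight, 0 = not yet computed */
};

/* Weight of a node: its own cycles plus the heaviest chain of
 * instructions it depends on. Results are cached in the node. */
int instr_depth(struct instr *n)
{
    if (n == NULL)
        return 0;
    if (n->depth != 0)
        return n->depth;

    int deepest = 0;
    for (size_t i = 0; i < MAX_SRCS; i++) {
        int d = instr_depth(n->srcs[i]);
        if (d > deepest)
            deepest = d;
    }
    n->depth = n->cycles + deepest;
    return n->depth;
}
```

A scheduler could then prefer the ready instruction with the largest weight, since it heads the longest remaining dependency chain.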
Rob Clark [Wed, 29 Jan 2014 22:03:07 +0000 (17:03 -0500)]
freedreno/a3xx/compiler: split out old compiler
For the time being, keep old compiler as fallback for things that the
new compiler does not support yet. Split out as its own commit to make
the later new-compiler commits easier to follow.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Wed, 29 Jan 2014 21:25:52 +0000 (16:25 -0500)]
freedreno/a3xx/compiler: prepare for new compiler
Shuffle things around to prepare for new compiler.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Wed, 29 Jan 2014 21:13:54 +0000 (16:13 -0500)]
freedreno/a3xx: remove useless reg tracking in disasm-a3xx
Not really used for anything anymore. So strip it out and avoid
conflicting symbols with upcoming new-compiler.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Carl Worth [Mon, 3 Feb 2014 21:54:50 +0000 (13:54 -0800)]
docs: Add release notes for 10.0.3
Which was just made.
Brian Paul [Mon, 3 Feb 2014 18:33:03 +0000 (11:33 -0700)]
draw: fix incorrect color of flat-shaded clipped lines
When we clipped a line we weren't copying the provoking vertex
color to the second vertex. We also weren't checking for
first vs. last provoking vertex.
Fixes failures found with the new piglit line-flat-clip-color test.
Cc: "10.0, 10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
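The flat-shading fix above comes down to making both endpoints of a clipped line carry the provoking vertex's color, and choosing that vertex according to the first/last convention. The following is a minimal sketch of that idea under assumed names (`struct vertex` and `copy_flat_color` are hypothetical), not the draw module's clipper.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical vertex with a position and an RGBA color. */
struct vertex {
    float pos[4];
    float color[4];
};

/* After clipping a flat-shaded line, make both endpoints carry the
 * provoking vertex's color so the clipped segment keeps a single color. */
void copy_flat_color(struct vertex *v0, struct vertex *v1,
                     bool first_vertex_provokes)
{
    if (first_vertex_provokes)
        memcpy(v1->color, v0->color, sizeof v1->color);
    else
        memcpy(v0->color, v1->color, sizeof v0->color);
}
```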
Brian Paul [Sat, 1 Feb 2014 17:51:43 +0000 (10:51 -0700)]
mesa: change GL_ALL_ATTRIB_BITS to 0xFFFFFFFF
This has been wrong for many years. It was originally 0x000FFFFF and long
ago there was discussion about whether GL_ALL_ATTRIB_BITS should include
the then-new GL_MULTISAMPLE_BIT bit. Eventually the ARB decided that
glPushAttrib(GL_ALL_ATTRIB_BITS) should save all current and future
attribute groups (hence ~0). Unfortunately, Mesa's gl.h was never updated.
This was just recently spotted by Eric Anholt and reported as a bug to the
ARB. Ian, Jon Leech and I discussed it at the ARB meeting and decided to
change Mesa's value to reflect the ARB's decision.
Acked-by: Eric Anholt <eric@anholt.net>
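The practical effect of the GL_ALL_ATTRIB_BITS change above can be seen with a simple mask test. The constants are redefined locally under different names so the sketch stands alone; 0x20000000 is the value of GL_MULTISAMPLE_BIT, and `mask_saves_group` is a hypothetical helper.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values defined locally so the sketch is self-contained. */
#define OLD_ALL_ATTRIB_BITS  0x000FFFFFu   /* Mesa's previous gl.h value */
#define NEW_ALL_ATTRIB_BITS  0xFFFFFFFFu   /* value after this change */
#define MULTISAMPLE_BIT      0x20000000u   /* GL_MULTISAMPLE_BIT */

/* Returns true if pushing `mask` would save the given attribute group. */
bool mask_saves_group(uint32_t mask, uint32_t group_bit)
{
    return (mask & group_bit) != 0;
}
```

With the old value the multisample group falls outside the mask, while the new all-ones value covers it along with any group bits added later.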
Brian Paul [Sat, 1 Feb 2014 00:27:04 +0000 (17:27 -0700)]
gallium/auxiliary/indices: replace free() with FREE()
To match the CALLOC_STRUCT() call.
Cc: "10.0, 10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Brian Paul [Sat, 1 Feb 2014 00:23:11 +0000 (17:23 -0700)]
svga: check shader size against max command buffer size
If the shader is too large, plug in a dummy shader. This patch also
reworks the existing dummy shader code.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Brian Paul [Sat, 1 Feb 2014 00:23:11 +0000 (17:23 -0700)]
svga: refactor some shader code
Put common code in new svga_shader.c file. Consolidate separate vertex/
fragment shader ID generation.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Zack Rusin [Tue, 28 Jan 2014 21:34:18 +0000 (16:34 -0500)]
gallivm: fix opcode and function nesting
gallivm soa code supported only a single level of nesting for
control flow opcodes (if, switch, loops...) but the d3d10 spec
clearly states that those are nested within functions. To support
nesting of conditionals inside functions we need to store the
nesting data inside function contexts and keep a stack of those.
Furthermore we make sure that if nesting for subroutines is deeper
than 32 then we simply ignore all subsequent 'call' invocations.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
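The structure described in the nesting commit above, per-function nesting data kept on a stack with calls ignored beyond a depth of 32, might look roughly like this. All names (`struct function_ctx`, `struct exec_state`, `push_function`, `pop_function`) are hypothetical and the context is reduced to a single counter. Only the limit of 32 comes from the commit message.

```c
#include <stdbool.h>

#define MAX_CALL_DEPTH 32   /* the limit named in the commit message */

/* Hypothetical per-function context holding its own control-flow nesting. */
struct function_ctx {
    int cond_depth;          /* nesting of if/switch/loop inside this function */
};

struct exec_state {
    struct function_ctx stack[MAX_CALL_DEPTH];
    int depth;               /* number of active function contexts */
};

/* Enters a subroutine; returns false (call ignored) when the stack is full. */
bool push_function(struct exec_state *s)
{
    if (s->depth >= MAX_CALL_DEPTH)
        return false;
    s->stack[s->depth].cond_depth = 0;   /* fresh nesting state per function */
    s->depth++;
    return true;
}

/* Leaves the current subroutine, restoring the caller's nesting state. */
void pop_function(struct exec_state *s)
{
    if (s->depth > 0)
        s->depth--;
}
```

Because each function context carries its own nesting counter, conditionals opened inside a callee cannot be confused with those of its caller.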
Kenneth Graunke [Sun, 2 Feb 2014 06:04:13 +0000 (22:04 -0800)]
mesa: Drop unnecessary (void) ctx from VAO code.
ctx is always used, even on release builds.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Kenneth Graunke [Sun, 2 Feb 2014 05:25:42 +0000 (21:25 -0800)]
mesa: Remove "APPLE" from some VAO error messages.
Chances are, people will be using the core names these days.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Kenneth Graunke [Sun, 2 Feb 2014 04:17:06 +0000 (20:17 -0800)]
mesa: Update some comments relating to VAOs.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Kenneth Graunke [Sun, 2 Feb 2014 03:46:45 +0000 (19:46 -0800)]
mesa: Rename ElementArrayBufferObj to IndexBufferObj.
DirectX and most hardware documentation use the term "Index Buffer" to
refer to a buffer containing indexes into arrays of vertex data, which
allows random access to vertex data, rather than sequential access.
OpenGL uses a different term for this concept: "Element Array Buffer".
However, "Index Buffer" has become much more widespread. A quick
Google search shows 29,300 hits for "Element Array Buffer" vs.
82,300 hits for "Index Buffer."
Arguably, "Index Buffer" is clearer: an "element of an array" (or list)
usually refers to an actual item stored in the array, not the index used
to refer to it.
The terminology is also already used in Mesa: some VBO module code for
dealing with ElementArrayBufferObj names local variables "ib".
Completely generated by:
$ find . -type f -print0 | xargs -0 sed -i \
's/ElementArrayBufferObj/IndexBufferObj/g'
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Kenneth Graunke [Sun, 2 Feb 2014 05:28:03 +0000 (21:28 -0800)]
mesa: Rename _mesa_lookup_arrayobj to _mesa_lookup_vao.
For consistency with the previous renames.
Completely generated by:
$ find . -type f -print0 | xargs -0 sed -i \
's/_mesa_lookup_arrayobj/_mesa_lookup_vao/g'
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Kenneth Graunke [Sun, 2 Feb 2014 04:48:51 +0000 (20:48 -0800)]
mesa: Rename _mesa_..._array_obj functions to _mesa_..._vao.
_mesa_update_vao_client_arrays() is less of a mouthful than
_mesa_update_array_object_client_arrays(), and generally clearer.
Generated by:
$ find . -type f -print0 | xargs -0 sed -i \
's/_mesa_\([^_]*\)_array_object/_mesa_\1_vao/g'
with manual whitespace and indentation fixes applied.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Kenneth Graunke [Sun, 2 Feb 2014 03:36:59 +0000 (19:36 -0800)]
mesa: Rename "struct gl_array_object" to gl_vertex_array_object.
I considered replacing it with "gl_vao", but spelling it out seemed to
fit better with Mesa's traditional style. Mesa doesn't shy away from
long type names - consider gl_transform_feedback_object,
gl_fragment_program_state, gl_uniform_buffer_binding, and so on.
Completely generated by:
$ find . -type f -print0 | xargs -0 sed -i \
's/gl_array_object/gl_vertex_array_object/g'
v2: Rerun command to resolve conflicts with Ian's meta patches.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Kenneth Graunke [Sun, 2 Feb 2014 03:31:22 +0000 (19:31 -0800)]
mesa: Rename "arrayObj" local variables to "vao".
Now that the field is named "VAO" instead of "ArrayObj", it makes sense
to call the local variables "vao" instead of "arrayObj".
Completely generated by:
$ find . -type f -print0 | xargs -0 sed -i 's/arrayObj/vao/g'
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Kenneth Graunke [Sun, 2 Feb 2014 03:14:38 +0000 (19:14 -0800)]
mesa: Rename ArrayObj to VAO and DefaultArrayObj to DefaultVAO.
When reading through the Mesa drawing code, it's not immediately obvious
to me that "ArrayObj" (gl_array_object) is the Vertex Array Object (VAO)
state. The comment above the structure explains this, but readers still
have to remember this and translate accordingly.
Out of context, "array object" is fairly vague. Even in context,
"array" has a lot of meanings: glDrawArrays, vertex data stored in user
arrays, gl_client_arrays, gl_vertex_attrib_arrays, and so on.
Using the term "VAO" immediately associates these fields with the OpenGL
concept, clarifying the situation and aiding programmer sanity.
Completely generated by:
$ find . -type f -print0 | xargs -0 sed -i \
-e 's/ArrayObj;/VAO;/g' \
-e 's/->ArrayObj/->VAO/g' \
-e 's/Array\.ArrayObj/Array.VAO/g' \
-e 's/Array\.DefaultArrayObj/Array.DefaultVAO/g'
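To see how the four expressions divide the work, here is a sketch that runs them over a few made-up lines (the sample text, including the bind_array helper, is hypothetical; only the -e expressions come from this message):

```shell
# Hypothetical sample lines, one per kind of occurrence.
sample='struct gl_array_object *ArrayObj;
if (ctx->Array.ArrayObj == ctx->Array.DefaultArrayObj)
save->ArrayObj = NULL;
bind_array(ctx, arrayObj);'

# The same four expressions as in the commit message, applied to stdin instead of in place.
converted=$(printf '%s\n' "$sample" | sed \
   -e 's/ArrayObj;/VAO;/g' \
   -e 's/->ArrayObj/->VAO/g' \
   -e 's/Array\.ArrayObj/Array.VAO/g' \
   -e 's/Array\.DefaultArrayObj/Array.DefaultVAO/g')

printf '%s\n' "$converted"
```

Each pattern is anchored on surrounding punctuation (";", "->", "Array."), so the field declaration and its uses are renamed while the type name gl_array_object and lower-case "arrayObj" locals are left for separate commits.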
v2: Rerun command to resolve conflicts with Ian's meta patches.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Ian Romanick [Sat, 14 Dec 2013 00:51:04 +0000 (16:51 -0800)]
meta: Silence several 'unused parameter' warnings
Silences many GCC warnings of the form:
drivers/common/meta.c: In function 'cleanup_temp_texture':
drivers/common/meta.c:1208:41: warning: unused parameter 'ctx' [-Wunused-parameter]
drivers/common/meta.c: In function 'setup_ff_blit_framebuffer':
drivers/common/meta.c:1453:46: warning: unused parameter 'ctx' [-Wunused-parameter]
drivers/common/meta.c: In function 'meta_glsl_blit_cleanup':
drivers/common/meta.c:1998:43: warning: unused parameter 'ctx' [-Wunused-parameter]
drivers/common/meta.c: In function 'meta_glsl_clear_cleanup':
drivers/common/meta.c:2287:44: warning: unused parameter 'ctx' [-Wunused-parameter]
drivers/common/meta.c: In function 'setup_ff_generate_mipmap':
drivers/common/meta.c:3365:45: warning: unused parameter 'ctx' [-Wunused-parameter]
drivers/common/meta.c: In function 'meta_glsl_generate_mipmap_cleanup':
drivers/common/meta.c:3556:54: warning: unused parameter 'ctx' [-Wunused-parameter]
There are a couple other similar warnings, but they are less trivial. I
want to investigate these further before axing them.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Ian Romanick [Fri, 13 Dec 2013 21:40:48 +0000 (13:40 -0800)]
meta: Don't use fixed-function to decompress array textures
Array textures can't be used with fixed-function, so don't. Instead,
just drop the decompress request on the floor. This is no worse than
what was done previously because generating the GL error (in
_mesa_set_enable) broke everything anyway.
A later patch will get GL_TEXTURE_2D_ARRAY targets working.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Ian Romanick [Fri, 13 Dec 2013 22:12:09 +0000 (14:12 -0800)]
meta: Use NDC in decompress_texture_image
There is no need to use pixel coordinates, and using NDC directly will
simplify the GLSL paths.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Ian Romanick [Sat, 14 Dec 2013 19:27:29 +0000 (11:27 -0800)]
meta: Consistently use non-Apple VAO functions
For these objects, meta was already using the non-Apple function to
delete the objects. Everywhere else in the file uses
_mesa_GenVertexArrays and _mesa_BindVertexArray.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "9.1 9.2 10.0" <mesa-stable@lists.freedesktop.org>
Ian Romanick [Fri, 13 Dec 2013 23:59:38 +0000 (15:59 -0800)]
meta: Fall back to software for GetTexImage of compressed GL_TEXTURE_CUBE_MAP_ARRAY
The hardware decompression path isn't even close to being able to handle
this. This converts the crash (assertion failure) in
"EXT_texture_compression_s3tc/getteximage-targets S3TC CUBE_ARRAY" to a
plain old failure.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "9.1 9.2 10.0" <mesa-stable@lists.freedesktop.org>
Ian Romanick [Sat, 14 Dec 2013 19:58:45 +0000 (11:58 -0800)]
meta: Release resources used by _mesa_meta_DrawPixels
_mesa_meta_DrawPixels creates a VAO and (potentially) two fragment
programs, but none of them are ever released. Leaking piles of memory
is generally frowned upon.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "9.1 9.2 10.0" <mesa-stable@lists.freedesktop.org>
Ian Romanick [Fri, 13 Dec 2013 22:36:17 +0000 (14:36 -0800)]
meta: Release resources used by decompress_texture_image
decompress_texture_image creates an FBO, an RBO, a VBO, a VAO, and a
sampler object, but none of them are ever released. Later patches will
add program objects, exacerbating the problem. Leaking piles of memory
is generally frowned upon.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "9.1 9.2 10.0" <mesa-stable@lists.freedesktop.org>
Ian Romanick [Sat, 23 Nov 2013 20:16:57 +0000 (12:16 -0800)]
mesa: Use common _mesa_tex_target_to_index in tex param code
TEXTURE_BUFFER_INDEX has to be specially called out because it is not
allowed in any of the glTexParameter or glGetTexParameter functions.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Ian Romanick [Fri, 22 Nov 2013 19:35:30 +0000 (11:35 -0800)]
mesa: Make target_enum_to_index available outside texobj.c
The next patch will use this function in another file.
v2: Rename _mesa_target_enum_to_index to _mesa_tex_target_to_index.
Suggested by Brian.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Brian Paul [Sat, 1 Feb 2014 15:58:43 +0000 (08:58 -0700)]
mesa: make several FBO functions static
The four functions in question weren't called from any other file.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Brian Paul [Sat, 1 Feb 2014 15:58:43 +0000 (08:58 -0700)]
mesa: move glGenerateMipmap() code into new genmipmap.c file
Mipmap generation has nothing to do with FBOs.
v2: update gl_genexec.py too (not api_exec.c)
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Brian Paul [Sat, 1 Feb 2014 15:58:43 +0000 (08:58 -0700)]
mesa: move glBlitFramebuffer code into new blit.c file
Just for better organization.
v2: update gl_genexec.py too (not api_exec.c)
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Brian Paul [Sat, 1 Feb 2014 00:38:35 +0000 (17:38 -0700)]
mesa: don't signal _NEW_TEXTURE in TexSubImage() functions
glTexSubImage(), glCopyTexSubImage() and glCompressedTexSubImage()
only change the texel data, not other state like texture size or format.
If a driver really needs to do something special it can hook into the
corresponding driver functions or Map/UnmapTextureImage().
This should avoid some needless state validation effort.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Brian Paul [Sat, 1 Feb 2014 00:38:35 +0000 (17:38 -0700)]
mesa: add some comments about mipmap generation
Trivial.
Brian Paul [Sat, 1 Feb 2014 00:38:35 +0000 (17:38 -0700)]
mesa: simplify comment in texstorage.c
Trivial.
Brian Paul [Sat, 1 Feb 2014 00:38:35 +0000 (17:38 -0700)]
mesa: formatting fixes, 78-column wrappings in dd.h
Trivial.
Brian Paul [Sat, 1 Feb 2014 00:28:08 +0000 (17:28 -0700)]
mesa: remove target param from ctx->Driver.TexParameter()
Not really used anywhere.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Brian Paul [Sat, 1 Feb 2014 00:28:08 +0000 (17:28 -0700)]
gallivm: add a few const qualifiers
Trivial.
Brian Paul [Sat, 1 Feb 2014 00:27:04 +0000 (17:27 -0700)]
translate: reindent translate_sse.c
Trivial.