mesa.git
10 years agoglsl: Initialize ubo_binding_mask flags to zero.
Matt Turner [Mon, 3 Feb 2014 19:51:51 +0000 (11:51 -0800)]
glsl: Initialize ubo_binding_mask flags to zero.

Missed in commit e63bb298. Caused sporadic test failures, like
incorrect-in-layout-qualifier-repeated-prim.geom.

Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
10 years agogallium/radeon: fix warnings
Marek Olšák [Thu, 6 Feb 2014 16:43:29 +0000 (17:43 +0100)]
gallium/radeon: fix warnings

10 years agogallium: remove PIPE_USAGE_STATIC
Marek Olšák [Mon, 3 Feb 2014 02:42:17 +0000 (03:42 +0100)]
gallium: remove PIPE_USAGE_STATIC

Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agogallium: define the behavior of PIPE_USAGE_* flags properly
Marek Olšák [Mon, 3 Feb 2014 02:21:29 +0000 (03:21 +0100)]
gallium: define the behavior of PIPE_USAGE_* flags properly

STATIC will be removed in the following commit.

v2: changed the definition of IMMUTABLE

Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agogallium: remove PIPE_RESOURCE_FLAG_GEN_MIPS
Marek Olšák [Mon, 3 Feb 2014 02:20:13 +0000 (03:20 +0100)]
gallium: remove PIPE_RESOURCE_FLAG_GEN_MIPS

Unused.

Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agor600g,radeonsi: set resource domains in one place (v2)
Marek Olšák [Tue, 4 Feb 2014 17:35:40 +0000 (18:35 +0100)]
r600g,radeonsi: set resource domains in one place (v2)

v2: This doesn't change the behavior. It only moves the tiling check
    to r600_init_resource and removes the usage parameter.

Reviewed-by: Christian König <christian.koenig@amd.com>
10 years agost/mesa: fix crash when a shader uses a TBO and it's not bound
Marek Olšák [Thu, 6 Feb 2014 01:16:50 +0000 (02:16 +0100)]
st/mesa: fix crash when a shader uses a TBO and it's not bound

This binds a NULL sampler view in that case.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74251

Cc: "10.1" "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agost/omx: add workaround for bug in Bellagio
Christian König [Tue, 28 Jan 2014 13:21:14 +0000 (06:21 -0700)]
st/omx: add workaround for bug in Bellagio

Not blocking for the message thread can lead to accessing freed up memory.

Signed-off-by: Christian König <christian.koenig@amd.com>
10 years agost/omx: initial OpenMAX support v3
Christian König [Mon, 5 Aug 2013 17:41:27 +0000 (11:41 -0600)]
st/omx: initial OpenMAX support v3

Featuring a full grown MPEG2 and H264 decoder and a couple of hundred bugs.

v2 (Leo): fix an error for pic_order_cnt_type 1
v3 (Leo): implement support for field decoding

Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Leo Liu <leo.liu@amd.com>
10 years agovl/rbsp: add H.264 RBSP implementation
Christian König [Tue, 17 Sep 2013 14:20:32 +0000 (08:20 -0600)]
vl/rbsp: add H.264 RBSP implementation

Signed-off-by: Christian König <christian.koenig@amd.com>
10 years agovl/vlc: add function to limit the vlc size
Christian König [Tue, 17 Sep 2013 13:27:38 +0000 (07:27 -0600)]
vl/vlc: add function to limit the vlc size

Signed-off-by: Christian König <christian.koenig@amd.com>
10 years agovl/vlc: add remove bits function
Christian König [Tue, 17 Sep 2013 13:22:34 +0000 (07:22 -0600)]
vl/vlc: add remove bits function

Signed-off-by: Christian König <christian.koenig@amd.com>
10 years agoradeon: update legal notes on UVD
Christian König [Mon, 3 Feb 2014 17:12:43 +0000 (10:12 -0700)]
radeon: update legal notes on UVD

Signed-off-by: Christian König <christian.koenig@amd.com>
10 years agoradeon: just don't map VRAM buffers at all
Christian König [Mon, 27 Jan 2014 10:40:25 +0000 (03:40 -0700)]
radeon: just don't map VRAM buffers at all

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
10 years agoradeon/video: directly create buffers in the right domain
Christian König [Tue, 21 Jan 2014 18:49:06 +0000 (11:49 -0700)]
radeon/video: directly create buffers in the right domain

Avoid moving things around on start of stream.

Signed-off-by: Christian König <christian.koenig@amd.com>
10 years agoradeon/video: separate common video functions
Christian König [Thu, 17 Oct 2013 12:21:40 +0000 (06:21 -0600)]
radeon/video: separate common video functions

Signed-off-by: Christian König <christian.koenig@amd.com>
10 years agogallium/dri2: Fix dri2_dup_image
Axel Davy [Thu, 30 Jan 2014 15:10:54 +0000 (16:10 +0100)]
gallium/dri2: Fix dri2_dup_image

dri2_dup_image was not copying the dri_format field.

This was causing some bugs, for example:
. we create a gbm_bo.
. we get an EGLImage from the gbm_bo.
. Bug: it is impossible to get the gbm_bo back from the EGLImage by
  importing it. (gbm dri2 backend)

Signed-off-by: Axel Davy <axel.davy@ens.fr>
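
The dri2_dup_image fix above is an instance of a common bug class: a duplication routine that copies fields one by one and misses one. The sketch below is illustrative only. The struct and every name except dri_format are hypothetical and do not reflect the actual layout of the driver's image type.

```c
#include <stddef.h>

/* Hypothetical image record. Only dri_format is a name taken from the
 * commit message; the other fields are invented for illustration. */
struct sketch_image {
   int   format;
   int   dri_format;
   void *loader_private;
};

/* Field-by-field duplicate. Leaving out the dri_format line reproduces
 * the kind of bug the commit describes: the copy looks valid, but a
 * later import that keys on dri_format cannot match it. */
static struct sketch_image
sketch_dup_image(const struct sketch_image *src, void *loader_private)
{
   struct sketch_image out;

   out.format = src->format;
   out.dri_format = src->dri_format;   /* the field that was missed */
   out.loader_private = loader_private;
   return out;
}
```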
10 years agoi965/vs: Fix typo in brw_compute_vue_map
Chris Forbes [Sat, 25 Jan 2014 06:51:50 +0000 (19:51 +1300)]
i965/vs: Fix typo in brw_compute_vue_map

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agoi965: Fix register types in dump_instructions().
Kenneth Graunke [Wed, 5 Feb 2014 21:27:15 +0000 (13:27 -0800)]
i965: Fix register types in dump_instructions().

This regressed when I converted BRW_REGISTER_TYPE_* to be an abstract
type that doesn't match the hardware description.  dump_instruction()
was using reg_encoding[] from brw_disasm.c, which no longer matches
(and was incorrect for Gen8+ anyway).

This patch introduces a new function to convert the abstract enum values
into the letter suffix we expect.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reported-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agoegl/glx: Remove egl_glx driver
Chad Versace [Tue, 7 Jan 2014 20:08:30 +0000 (12:08 -0800)]
egl/glx: Remove egl_glx driver

Mesa now has a real, feature-rich EGL implementation on X11 via xcb.
Therefore I believe there is no longer a practical need for the egl_glx
driver.

Furthermore, egl_glx appears to be unmaintained.  The most recent
nontrivial commit to egl_glx was 6baa5f1 on 2011-11-25.

Tested by running weston-smoke in windowed Weston on X with i965.

Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Kristian Høgsberg <krh@bitplanet.net>
10 years agodocs: update 10.1 relnotes to note GL 3.3 on r600 and radeonsi.
Dave Airlie [Thu, 6 Feb 2014 01:03:09 +0000 (01:03 +0000)]
docs: update 10.1 relnotes to note GL 3.3 on r600 and radeonsi.

Signed-off-by: Dave Airlie <airlied@redhat.com>
10 years agotgsi/ureg: increase the number of immediates
Zack Rusin [Wed, 5 Feb 2014 00:33:12 +0000 (19:33 -0500)]
tgsi/ureg: increase the number of immediates

ureg_program is allocated on the heap so we can just bump the
number of immediates that it can handle. It's needed for d3d10.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
10 years agogallivm: make sure analysis works with large number of immediates
Zack Rusin [Wed, 5 Feb 2014 00:32:04 +0000 (19:32 -0500)]
gallivm: make sure analysis works with large number of immediates

We need to handle a lot more immediates and in order to do that
we also switch from allocating this structure on the stack to
allocating it on the heap.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
10 years agogallivm: handle huge number of immediates
Zack Rusin [Wed, 5 Feb 2014 00:28:58 +0000 (19:28 -0500)]
gallivm: handle huge number of immediates

We only supported up to 256 immediates, which isn't enough. We had
code which stored immediates in a dynamically allocated array, but it
was always used alongside a statically backed array for performance
reasons. This commit adds code to skip that performance optimization
and use only the dynamically allocated immediates if the number of
them is too great.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
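
The fallback described in the commit above can be sketched roughly as follows. This is a simplified model, not the gallivm code: the struct, its fields and the helper names are hypothetical, and the sketch only allocates heap storage when the count exceeds the fixed limit.

```c
#include <stdbool.h>
#include <stdlib.h>

#define SKETCH_STATIC_IMM_LIMIT 256   /* the old limit quoted in the commit */

/* Hypothetical container; all names are illustrative. */
struct sketch_imm_store {
   float    fixed[SKETCH_STATIC_IMM_LIMIT][4]; /* fast path, fixed capacity */
   float  (*dynamic)[4];                       /* heap storage, any capacity */
   unsigned count;
   bool     use_dynamic_only;
};

/* Pick the storage strategy once, based on the number of immediates. */
static bool
sketch_imm_store_init(struct sketch_imm_store *s, unsigned count)
{
   s->count = count;
   s->use_dynamic_only = count > SKETCH_STATIC_IMM_LIMIT;
   s->dynamic = NULL;
   if (s->use_dynamic_only) {
      s->dynamic = calloc(count, sizeof *s->dynamic);
      if (s->dynamic == NULL)
         return false;
   }
   return true;
}

/* Return a pointer to the four channels of immediate `index`. */
static float *
sketch_imm_store_get(struct sketch_imm_store *s, unsigned index)
{
   return s->use_dynamic_only ? s->dynamic[index] : s->fixed[index];
}

static void
sketch_imm_store_fini(struct sketch_imm_store *s)
{
   free(s->dynamic);
   s->dynamic = NULL;
}
```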
10 years agogallivm: allow large numbers of temporaries
Zack Rusin [Tue, 4 Feb 2014 02:40:24 +0000 (21:40 -0500)]
gallivm: allow large numbers of temporaries

The number of allowed temporaries increases with almost every
iteration of an API. We used to support 128, then we started
increasing and the newer APIs support 4096+. So if we notice
that the number of temporaries is larger than our statically
allocated storage would allow, we just treat them as indexable
temporaries and allocate them as an array from the start.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
10 years agoi965/fs: Assume FBO rendering in precompile if MRT.
Chris Forbes [Sat, 25 Jan 2014 22:04:42 +0000 (11:04 +1300)]
i965/fs: Assume FBO rendering in precompile if MRT.

If multiple color outputs are written, this shader is unlikely to be
useful with a winsys framebuffer.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agoi965/fs: Guess nr_color_regions better in precompile
Chris Forbes [Sat, 25 Jan 2014 22:03:33 +0000 (11:03 +1300)]
i965/fs: Guess nr_color_regions better in precompile

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agodocs: Add relnotes for 10.2
Chris Forbes [Wed, 5 Feb 2014 21:17:17 +0000 (10:17 +1300)]
docs: Add relnotes for 10.2

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
10 years agomesa: Bump version to 10.2.0-devel
Chris Forbes [Wed, 5 Feb 2014 21:14:40 +0000 (10:14 +1300)]
mesa: Bump version to 10.2.0-devel

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
10 years agoi965: Move intel_prepare_render() above first buffer access
Kristian Høgsberg [Wed, 5 Feb 2014 18:59:02 +0000 (10:59 -0800)]
i965: Move intel_prepare_render() above first buffer access

The driver is supposed to ensure buffers before any drawing operation, but in
do_blit_drawpixels() and do_blit_copypixels() we inspect the buffer format
before calling intel_prepare_render().  That was covered up by the
unconditional call to intel_prepare_render() in intelMakeCurrent(), but we
now only do this on the initial intelMakeCurrent call for a context
(to get the size for the initial viewport values).

https://bugs.freedesktop.org/show_bug.cgi?id=74083

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Tested-by: Alexander Monakov <amonakov@gmail.com>
10 years agost/mesa: add MESA_SHADER_COMPUTE case in shader_stage_to_ptarget()
Brian Paul [Wed, 5 Feb 2014 17:45:14 +0000 (10:45 -0700)]
st/mesa: add MESA_SHADER_COMPUTE case in shader_stage_to_ptarget()

Silences compiler warning.  Trivial.

10 years agomesa: re-wrap, fix-up comment text in formats.h
Brian Paul [Tue, 4 Feb 2014 19:19:42 +0000 (12:19 -0700)]
mesa: re-wrap, fix-up comment text in formats.h

Wrap to 78 columns, fix comment formatting.
Trivial.

10 years agoi965/cs: Allow ARB_compute_shader to be enabled via env var.
Paul Berry [Mon, 6 Jan 2014 23:12:05 +0000 (15:12 -0800)]
i965/cs: Allow ARB_compute_shader to be enabled via env var.

This will allow testing of compute shader functionality before it is
completed.

To enable ARB_compute_shader functionality in the i965 driver, set
INTEL_COMPUTE_SHADER=1.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
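
A minimal sketch of an environment-variable gate like the one in the commit above. Only the variable name INTEL_COMPUTE_SHADER and the value 1 come from the commit message. The helper names and the exact parsing rule (the string "1" enables, anything else does not) are assumptions of this sketch, not the driver's code.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical parsing rule: only the exact string "1" counts as enabled. */
static bool
sketch_env_value_enables(const char *value)
{
   return value != NULL && strcmp(value, "1") == 0;
}

/* Hypothetical wrapper that reads the variable named in the commit. */
static bool
sketch_compute_shader_requested(void)
{
   return sketch_env_value_enables(getenv("INTEL_COMPUTE_SHADER"));
}
```

Splitting the parsing from the getenv() call keeps the rule testable without touching the process environment.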
10 years agoi965/cs: Create the brw_compute_program struct, and the code to initialize it.
Paul Berry [Tue, 7 Jan 2014 23:51:13 +0000 (15:51 -0800)]
i965/cs: Create the brw_compute_program struct, and the code to initialize it.

v2: Fix comment.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
10 years agoglsl/cs: Prohibit mixing of compute and non-compute shaders.
Paul Berry [Wed, 8 Jan 2014 19:40:23 +0000 (11:40 -0800)]
glsl/cs: Prohibit mixing of compute and non-compute shaders.

Fixes piglit test:
spec/ARB_compute_shader/linker/mix_compute_and_non_compute

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
10 years agoglsl/cs: Prohibit user-defined ins/outs in compute shaders.
Paul Berry [Wed, 8 Jan 2014 09:54:26 +0000 (01:54 -0800)]
glsl/cs: Prohibit user-defined ins/outs in compute shaders.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
10 years agomain/cs: Implement query for COMPUTE_WORK_GROUP_SIZE.
Paul Berry [Thu, 9 Jan 2014 12:03:30 +0000 (04:03 -0800)]
main/cs: Implement query for COMPUTE_WORK_GROUP_SIZE.

v2: Improve error message.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
10 years agomesa/cs: Handle compute shader local size during linking.
Paul Berry [Wed, 8 Jan 2014 19:59:28 +0000 (11:59 -0800)]
mesa/cs: Handle compute shader local size during linking.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
10 years agoglsl/cs: Handle compute shader local_size_{x,y,z} declaration.
Paul Berry [Mon, 6 Jan 2014 17:09:31 +0000 (09:09 -0800)]
glsl/cs: Handle compute shader local_size_{x,y,z} declaration.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
10 years agomesa/cs: Implement MAX_COMPUTE_WORK_GROUP_COUNT constant.
Paul Berry [Wed, 8 Jan 2014 09:42:58 +0000 (01:42 -0800)]
mesa/cs: Implement MAX_COMPUTE_WORK_GROUP_COUNT constant.

v2: Document that the 3-element array MaxComputeWorkGroupCount is
indexed by dimension.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
10 years agomesa/cs: Implement MAX_COMPUTE_WORK_GROUP_INVOCATIONS constant.
Paul Berry [Mon, 6 Jan 2014 23:11:40 +0000 (15:11 -0800)]
mesa/cs: Implement MAX_COMPUTE_WORK_GROUP_INVOCATIONS constant.

Reviewed-by: Matt Turner <mattst88@gmail.com>
v2: Use CONTEXT_INT rather than CONTEXT_ENUM.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
10 years agomesa/cs: Implement MAX_COMPUTE_WORK_GROUP_SIZE constant.
Paul Berry [Mon, 6 Jan 2014 21:31:58 +0000 (13:31 -0800)]
mesa/cs: Implement MAX_COMPUTE_WORK_GROUP_SIZE constant.

v2: Document that the 3-element array MaxComputeWorkGroupSize is
indexed by dimension.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
10 years agomesa/cs: Create the gl_compute_program struct, and the code to initialize it.
Paul Berry [Tue, 7 Jan 2014 23:50:39 +0000 (15:50 -0800)]
mesa/cs: Create the gl_compute_program struct, and the code to initialize it.

Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agomesa/cs: Handle compute shaders in _mesa_use_program().
Paul Berry [Tue, 7 Jan 2014 17:00:02 +0000 (09:00 -0800)]
mesa/cs: Handle compute shaders in _mesa_use_program().

v2: do cs after the ordered pipeline stages for consistency.

Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agoglsl/cs: update main.cpp to use the ".comp" extension for compute shaders.
Paul Berry [Tue, 7 Jan 2014 17:00:02 +0000 (09:00 -0800)]
glsl/cs: update main.cpp to use the ".comp" extension for compute shaders.

Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agoglsl/cs: Populate default values for ctx->Const.Program[MESA_SHADER_COMPUTE].
Paul Berry [Tue, 7 Jan 2014 04:06:05 +0000 (20:06 -0800)]
glsl/cs: Populate default values for ctx->Const.Program[MESA_SHADER_COMPUTE].

Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agomesa/cs: Add a MESA_SHADER_COMPUTE stage and update switch statements.
Paul Berry [Tue, 7 Jan 2014 04:06:05 +0000 (20:06 -0800)]
mesa/cs: Add a MESA_SHADER_COMPUTE stage and update switch statements.

This patch adds MESA_SHADER_COMPUTE to the gl_shader_stage enum.
Also, where it is trivial to do so, it adds a compute shader case to
switch statements that switch based on the type of shader.  This
avoids "unhandled switch case" compiler warnings.

Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agoglsl/cs: Change some linker loops to use MESA_SHADER_FRAGMENT as a bound.
Paul Berry [Tue, 7 Jan 2014 03:47:25 +0000 (19:47 -0800)]
glsl/cs: Change some linker loops to use MESA_SHADER_FRAGMENT as a bound.

Linker loops that iterate through all the stages in the pipeline need
to use MESA_SHADER_FRAGMENT as a bound, so that we can add an
additional MESA_SHADER_COMPUTE stage, without it being erroneously
included in the pipeline.

Reviewed-by: Matt Turner <mattst88@gmail.com>
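
The loop-bound change above can be illustrated with a small sketch. The enum below is a stand-in with hypothetical names. The only property it is meant to show is that the compute stage sorts after the fragment stage, so a loop bounded by the fragment stage never visits it.

```c
/* Illustrative stage enum. The names and values are assumptions of this
 * sketch; what matters is that compute comes after fragment. */
enum sketch_shader_stage {
   SKETCH_SHADER_VERTEX,
   SKETCH_SHADER_GEOMETRY,
   SKETCH_SHADER_FRAGMENT,
   SKETCH_SHADER_COMPUTE,
   SKETCH_SHADER_STAGES
};

/* Count the stages of the graphics pipeline that are present. Bounding
 * the loop by the fragment stage, rather than by SKETCH_SHADER_STAGES,
 * keeps the compute stage out of the pipeline walk. */
static unsigned
sketch_count_pipeline_stages(const int present[SKETCH_SHADER_STAGES])
{
   unsigned n = 0;

   for (int i = 0; i <= SKETCH_SHADER_FRAGMENT; i++) {
      if (present[i])
         n++;
   }
   return n;
}
```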
10 years agomesa/cs: Add dispatch API stubs for ARB_compute_shader.
Paul Berry [Mon, 6 Jan 2014 23:08:04 +0000 (15:08 -0800)]
mesa/cs: Add dispatch API stubs for ARB_compute_shader.

Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agomesa/cs: Add extension enable flags for ARB_compute_shader.
Paul Berry [Mon, 6 Jan 2014 17:09:07 +0000 (09:09 -0800)]
mesa/cs: Add extension enable flags for ARB_compute_shader.

Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agogallivm: fix F2U opcode
Roland Scheidegger [Tue, 4 Feb 2014 18:53:53 +0000 (19:53 +0100)]
gallivm: fix F2U opcode

Previously, we were really doing F2I. Also move it to the generic section.
(Note that for llvmpipe the generated code is definitely bad, due to the lack
of unsigned conversions with sse. I think, though, that what llvm does (using
scalar conversions to 64bit signed, either with x87 fpu (32bit) or sse (64bit),
including lots of domain changes) is quite suboptimal. It could do something like
is_large = arg >= 2^31
half_arg = 0.5 * arg
small_c = fptoint(arg)
large_c = fptoint(half_arg) << 1
res = select(is_large, large_c, small_c)
which should be far fewer instructions, but that's something llvm should do
itself.)

This fixes piglit fs/vs-float-uint-conversion.shader_test (maybe more, needs
GL 3.0 version override to run.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Zack Rusin <zackr@vmware.com>
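
The pseudocode in the F2U commit above can be written out in scalar C as a rough sketch. The function name is hypothetical, and the sketch assumes the input lies in [0, 2^32). Unlike the branch-free select form, it uses an if so that the out-of-range signed conversion, which is undefined behavior in C, is never evaluated.

```c
#include <stdint.h>

/* Float to unsigned 32-bit conversion built only from signed
 * conversions. Assumes 0 <= arg < 2^32. */
static uint32_t
sketch_f2u_via_signed(float arg)
{
   if (arg >= 2147483648.0f) {          /* is_large = arg >= 2^31 */
      /* Floats this large are multiples of 256, so halving is exact
       * and the halved value fits in a signed 32-bit integer. */
      int32_t half = (int32_t)(0.5f * arg);
      return (uint32_t)half << 1;       /* large_c */
   }
   return (uint32_t)(int32_t)arg;       /* small_c */
}
```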
10 years agotools/trace: Handle index buffer overflow gracefully.
José Fonseca [Fri, 31 Jan 2014 16:44:39 +0000 (16:44 +0000)]
tools/trace: Handle index buffer overflow gracefully.

Trivial.

10 years agodocs/GL3.txt: update r600 status
Dave Airlie [Tue, 4 Feb 2014 21:52:48 +0000 (07:52 +1000)]
docs/GL3.txt: update r600 status

This updates the r600 driver status to 3.3 being fully supported.

Signed-off-by: Dave Airlie <airlied@redhat.com>
10 years agor600g: add support for geom shaders to r600/r700 chipsets (v2)
Dave Airlie [Thu, 30 Jan 2014 04:19:57 +0000 (04:19 +0000)]
r600g: add support for geom shaders to r600/r700 chipsets (v2)

This is my first attempt at enabling r600/r700 geometry shaders.
The basic tests pass on both my rv770 and my rv635.

It requires this kernel patch:
http://www.spinics.net/lists/dri-devel/msg52745.html

v2: address Alex comments.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
10 years agor600g: enable GLSL 3.30 on evergreen GPUs
Dave Airlie [Wed, 29 Jan 2014 21:48:09 +0000 (21:48 +0000)]
r600g: enable GLSL 3.30 on evergreen GPUs

This throws the switch to enable GL 3.3 and GLSL 330.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
10 years agor600g: properly propagate clip dist write value
Dave Airlie [Tue, 4 Feb 2014 00:48:42 +0000 (10:48 +1000)]
r600g: properly propagate clip dist write value

This moves the value from the GS shader to the copy shader so the registers
are setup correctly.

fixes tests/spec/glsl-1.50/execution/geometry/clip-distance-out-values.shader_test

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
10 years agor600g: calculate a better value for array_size (v2)
Dave Airlie [Mon, 3 Feb 2014 05:31:26 +0000 (15:31 +1000)]
r600g: calculate a better value for array_size (v2)

attempt to calculate a better value for array size to avoid breaking apps.

v2: use 0xfff like streamout, suggested by Grigori

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
10 years agor600g: fix CAYMAN geometry shader support
Dave Airlie [Fri, 31 Jan 2014 03:35:51 +0000 (03:35 +0000)]
r600g: fix CAYMAN geometry shader support

cayman has a different end of program bit, so do that properly.

fixes hangs with geom shader tests on cayman.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
10 years agor600g: fix up shader out misc stuff for copy shader
Dave Airlie [Wed, 29 Jan 2014 00:17:15 +0000 (00:17 +0000)]
r600g: fix up shader out misc stuff for copy shader

set the correct values so the misc out register is setup correctly
for the copy shader.

This also updates the state for the gs copy shader so the hw
gets programmed correctly.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
10 years agor600g: port the layered surface rendering patch from radeonsi
Dave Airlie [Tue, 28 Jan 2014 23:15:29 +0000 (23:15 +0000)]
r600g: port the layered surface rendering patch from radeonsi

This just makes r600 and evergreen do what the radeonsi codepaths do
for layered rendering. This makes the 2d amd_vertex_shader_layer test
pass on evergreen.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
10 years agor600g: initial VS output layer support
Dave Airlie [Tue, 28 Jan 2014 03:04:00 +0000 (13:04 +1000)]
r600g: initial VS output layer support

This just adds support for emitting the proper value in the VS out misc.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
10 years agor600g: setup const texture buffers for geom shaders
Dave Airlie [Tue, 28 Jan 2014 02:06:49 +0000 (12:06 +1000)]
r600g: setup const texture buffers for geom shaders

This just enables the workarounds we have for vertex/pixel shaders
for geom shaders as well.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
10 years agor600g: calculate correct cut value
Dave Airlie [Fri, 24 Jan 2014 07:14:26 +0000 (17:14 +1000)]
r600g: calculate correct cut value

This selects the cut value depending on the shader selected.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
10 years agor600g: fix dynamic_input_array_index.shader_test
Dave Airlie [Fri, 24 Jan 2014 04:46:37 +0000 (14:46 +1000)]
r600g: fix dynamic_input_array_index.shader_test

This follows what fglrx does: it unpacks the input we are
going to indirect into a bunch of registers and indirects
inside them.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
10 years agor600g: add support for indirect geom ring writes
Dave Airlie [Fri, 24 Jan 2014 03:39:36 +0000 (13:39 +1000)]
r600g: add support for indirect geom ring writes

We need to be able to write to the ring using a base register
for when we emit vertices in a loop. In theory the SB compiler
could collapse these indirect writes to direct writes if the
register value is constant and known, but that is outside my
pay grade.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
10 years agor600g: write proper output prim type
Dave Airlie [Tue, 24 Dec 2013 05:59:19 +0000 (05:59 +0000)]
r600g: write proper output prim type

Vadim's code derived it from the info.mode, but it needs
to be taken from the geometry shader output primitive.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
10 years agor600g: enable instance cnt register with new enough kernel
Dave Airlie [Tue, 24 Dec 2013 05:30:37 +0000 (05:30 +0000)]
r600g: enable instance cnt register with new enough kernel

The instance cnt register was missing for a few kernels,
with a new enough kernel we can output it.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
10 years agor600g: add primitive input support for gs
Dave Airlie [Mon, 23 Dec 2013 01:30:03 +0000 (01:30 +0000)]
r600g: add primitive input support for gs

only enable prim id if gs uses it

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
10 years agor600g: emit streamout from dma copy shader
Dave Airlie [Thu, 19 Dec 2013 05:17:00 +0000 (05:17 +0000)]
r600g: emit streamout from dma copy shader

This enables streamout with GS in the mix, from the
VS dma shader.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
10 years agor600g/gs: fix cases where number of gs inputs != number of gs outputs
Dave Airlie [Wed, 18 Dec 2013 05:55:07 +0000 (15:55 +1000)]
r600g/gs: fix cases where number of gs inputs != number of gs outputs

this fixes a bunch of the geom shader built-in tests

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
10 years agor600g: increase array base for exported parameters
Dave Airlie [Tue, 28 Jan 2014 00:21:03 +0000 (10:21 +1000)]
r600g: increase array base for exported parameters

Trivial fix to Vadim's code.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
10 years agor600g: initialise the geom shader loop registers.
Dave Airlie [Fri, 24 Jan 2014 06:41:32 +0000 (16:41 +1000)]
r600g: initialise the geom shader loop registers.

As we do for vertex and pixel shaders.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
10 years agor600g: emit NOPs at end of shaders in more cases
Dave Airlie [Sat, 30 Nov 2013 06:26:13 +0000 (06:26 +0000)]
r600g: emit NOPs at end of shaders in more cases

If the shader has no CF clauses at all, emit a NOP.
If the last instruction is an ENDLOOP, add a NOP for the LOOP to go to.
If the last instruction is CALL_FS, add a NOP.

These fix a bunch of hangs in the geometry shader tests.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
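
The three cases listed in the commit above reduce to a small predicate. The sketch below is hypothetical: the opcode enum and function name are invented for illustration and do not match the r600g bytecode structures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative control-flow opcodes; not the driver's definitions. */
enum sketch_cf_op {
   SKETCH_CF_ALU,
   SKETCH_CF_EXPORT,
   SKETCH_CF_ENDLOOP,
   SKETCH_CF_CALL_FS
};

/* Decide whether a NOP must be appended to the CF program:
 *  - the program has no CF clauses at all,
 *  - the last instruction is ENDLOOP (the loop needs somewhere to go),
 *  - the last instruction is CALL_FS. */
static bool
sketch_needs_trailing_nop(const enum sketch_cf_op *cf, size_t count)
{
   if (count == 0)
      return true;
   return cf[count - 1] == SKETCH_CF_ENDLOOP ||
          cf[count - 1] == SKETCH_CF_CALL_FS;
}
```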
10 years agor600g: don't enable SB for geom shaders
Dave Airlie [Thu, 28 Nov 2013 23:38:35 +0000 (23:38 +0000)]
r600g: don't enable SB for geom shaders

SB needs fixes for three GS instructions; it seems to raise
them outside loops etc. despite my best efforts.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
10 years agor600g/sb: add MEM_RING support
Dave Airlie [Tue, 24 Dec 2013 04:56:25 +0000 (04:56 +0000)]
r600g/sb: add MEM_RING support

Although we don't use SB on geom shaders, the VS copy shader will use it
so we might as well implement MEM_RING support in sb.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
10 years agor600g: don't fail if we can't map VS->GS ring entries
Dave Airlie [Wed, 29 Jan 2014 04:08:43 +0000 (04:08 +0000)]
r600g: don't fail if we can't map VS->GS ring entries

This can happen in normal operation, so don't report an error on it,
just continue.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
10 years agor600g: initial support for geometry shaders on evergreen (v2)
Vadim Girlin [Fri, 2 Aug 2013 02:38:23 +0000 (06:38 +0400)]
r600g: initial support for geometry shaders on evergreen (v2)

This is Vadim's initial work with a few regression fixes squashed in.

v2: (airlied)
fix regression in glsl-max-varyings - need to use vs and ps_dirty
fix regression in shader exports from rebasing.
whitespace fixing.
v2.1: squash fix assert

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
10 years agor600g: add hw register definitions for GS block setup
Vadim Girlin [Fri, 2 Aug 2013 02:32:32 +0000 (06:32 +0400)]
r600g: add hw register definitions for GS block setup

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
10 years agor600g: defer shader variant selection and depending state updates
Vadim Girlin [Wed, 31 Jul 2013 19:09:39 +0000 (23:09 +0400)]
r600g: defer shader variant selection and depending state updates

[airlied: fix dropped streamout line - fix for master]

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
10 years agor600g/bc: add support for indexed memory writes.
Dave Airlie [Mon, 13 Jan 2014 00:19:00 +0000 (10:19 +1000)]
r600g/bc: add support for indexed memory writes.

It looks like we need these for geom shaders in the future.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
10 years agor600g: move barrier and end_of_program bits from output to cf struct (v2)
Vadim Girlin [Wed, 31 Jul 2013 16:02:22 +0000 (20:02 +0400)]
r600g: move barrier and end_of_program bits from output to cf struct (v2)

v2: fix regression on r600 NOP instructions.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
10 years agor600g: split streamout emit code into a separate function
Dave Airlie [Wed, 29 Jan 2014 01:33:14 +0000 (01:33 +0000)]
r600g: split streamout emit code into a separate function

For geometry shaders we need to call this code from a second place.

Just move it out for now to keep future patches cleaner.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
10 years agor600g,radeonsi: skip unnecessary buffer_is_busy call, add a comment
Marek Olšák [Sat, 1 Feb 2014 14:06:39 +0000 (15:06 +0100)]
r600g,radeonsi: skip unnecessary buffer_is_busy call, add a comment

10 years agor600g,radeonsi: skip busy-checking for DISCARD_RANGE if it has been done already
Marek Olšák [Sat, 1 Feb 2014 13:59:28 +0000 (14:59 +0100)]
r600g,radeonsi: skip busy-checking for DISCARD_RANGE if it has been done already

10 years agor600g,radeonsi: treat DYNAMIC and STREAM usage as STAGING
Marek Olšák [Sat, 1 Feb 2014 13:01:20 +0000 (14:01 +0100)]
r600g,radeonsi: treat DYNAMIC and STREAM usage as STAGING

10 years agogallium: remove PIPE_CAP_MAX_COMBINED_SAMPLERS
Marek Olšák [Fri, 17 Jan 2014 21:52:28 +0000 (22:52 +0100)]
gallium: remove PIPE_CAP_MAX_COMBINED_SAMPLERS

This can be derived from the shader caps.

All GPUs from ATI/AMD, NVIDIA, and INTEL have separate texture slots
for each shader stage.

10 years agomesa: remove stray bits of GL_EXT_cull_vertex
Brian Paul [Tue, 4 Feb 2014 17:38:59 +0000 (10:38 -0700)]
mesa: remove stray bits of GL_EXT_cull_vertex

GL_EXT_cull_vertex was removed back in 2010 in commit 02984e3536
but these bits still lingered.

Reviewed-by: Eric Anholt <eric@anholt.net>
10 years agoglsl: Fix continue statements in do-while loops.
Paul Berry [Fri, 31 Jan 2014 17:55:35 +0000 (09:55 -0800)]
glsl: Fix continue statements in do-while loops.

From the GLSL 4.40 spec, section 6.4 (Jumps):

    The continue jump is used only in loops. It skips the remainder of
    the body of the inner most loop of which it is inside. For while
    and do-while loops, this jump is to the next evaluation of the
    loop condition-expression from which the loop continues as
    previously defined.

Previously, we incorrectly treated a "continue" statement as jumping
to the top of a do-while loop.

This patch fixes the problem by replicating the loop condition when
converting the "continue" statement to IR.  (We already do a similar
thing in "for" loops, to ensure that "continue" causes the loop
expression to be executed).

Fixes piglit tests:
- glsl-fs-continue-inside-do-while.shader_test
- glsl-vs-continue-inside-do-while.shader_test
- glsl-fs-continue-in-switch-in-do-while.shader_test
- glsl-vs-continue-in-switch-in-do-while.shader_test

Cc: mesa-stable@lists.freedesktop.org
Acked-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agoglsl: Make condition_to_hir() callable from outside ast_iteration_statement.
Paul Berry [Fri, 31 Jan 2014 17:50:37 +0000 (09:50 -0800)]
glsl: Make condition_to_hir() callable from outside ast_iteration_statement.

In addition to making it public, we also need to change its first
argument from an ir_loop * to an exec_list *, so that it can be used
to insert the condition anywhere in the IR (rather than just in the
body of the loop).

This will be necessary in order to make continue statements work
properly in do-while loops.

Cc: mesa-stable@lists.freedesktop.org
Acked-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agoi965/blorp: do not use unnecessary hw-blending support
Topi Pohjolainen [Mon, 27 Jan 2014 08:50:01 +0000 (10:50 +0200)]
i965/blorp: do not use unnecessary hw-blending support

This is really not needed as blorp blit programs already sample
XRGB normally and get alpha channel set to 1.0 automatically by
the sampler engine. This is simply copied directly to the payload
of the render target write message and hence there is no need for
any additional blending support from the pixel processing pipeline.

The blending formula is broken for color components anyway: it
multiplies the color component with itself (the blend factor is the
component itself).
Alpha blending in turn would not fix the alpha to one independent
of the source but would simply use the source alpha as is instead
(1.0 * src_alpha + 0.0 * dst_alpha).

Quoting Eric:

 "If we want to actually make the no-alpha-bits-present thing work,
  we need to override the bits in the surface state or in the
  generated code.  In the normal draw path, it's done for sampling
  by the swizzling code in brw_wm_surface_state.c, and the blending
  overrides is just to fix up the alpha blending stage which
  doesn't pay attention to that for the destination surface."

If one modifies piglit test gl-3.2-layered-rendering-blit to use
color component values other than zero or one, this change will
kick in on IVB. No regressions on IVB.
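
The arithmetic described above can be worked through with a small C
sketch.  It models only what the message states: the source color term
being multiplied by itself, and the alpha equation
1.0 * src_alpha + 0.0 * dst_alpha.  The function names are hypothetical,
and the destination contribution to the color equation is left out
because the message does not specify it.

```c
/* Source color term as described: the blend factor is the
 * component itself, so the result is src * src. */
float blend_color_src_term(float src)
{
    return src * src;
}

/* Alpha equation as described: the source alpha passes through
 * unchanged instead of being forced to 1.0. */
float blend_alpha(float src_alpha, float dst_alpha)
{
    return 1.0f * src_alpha + 0.0f * dst_alpha;
}
```

Components of exactly 0.0 or 1.0 are unchanged by squaring, which is
consistent with the test needing other values to expose the problem: a
component of 0.5 would come out as 0.25.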

This is effectively a revert of c0554141a9b831b4e614747104dcbbe0fe489b9d:

    i965/blorp: Support overriding destination alpha to 1.0.

    Currently, Blorp requires the source and destination formats to be
    equal.  However, we'd really like to be able to blit between XRGB and
    ARGB formats; our BLT engine paths have supported this for a long time.

    For ARGB -> XRGB, nothing needs to occur: the missing alpha is already
    interpreted as 1.0.  For XRGB -> ARGB, we need to smash the alpha
    channel to 1.0 when writing the destination colors.  This is fairly
    straightforward with blending.

    For now, this code is never used, as the source and destination formats
    still must be equal.  The next patch will relax that restriction.

    NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
10 years agoradeon/uvd: fix feedback buffer handling v2
Christian König [Mon, 3 Feb 2014 09:28:58 +0000 (02:28 -0700)]
radeon/uvd: fix feedback buffer handling v2

Without the correct feedback buffer size UVD runs
into an error on each frame, reducing the maximum FPS.

v2: fixing Michel's comments

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Cc: "10.1" "10.0" "9.2" <mesa-stable@lists.freedesktop.org>
10 years agoi965: Use brw_bo_map[_gtt]() in intel_miptree_map_raw().
Kenneth Graunke [Wed, 29 Jan 2014 17:27:09 +0000 (09:27 -0800)]
i965: Use brw_bo_map[_gtt]() in intel_miptree_map_raw().

This moves the intel_batchbuffer_flush before the drm_intel_bo_busy
call, which is a change in behavior.  However, the old behavior was
broken.

In the future, we may want to only flush if the batchbuffer references
the BO being mapped.  That's certainly more typical.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
10 years agoi965: Use brw_bo_map() in intel_texsubimage_tiled_memcpy().
Kenneth Graunke [Wed, 29 Jan 2014 17:24:32 +0000 (09:24 -0800)]
i965: Use brw_bo_map() in intel_texsubimage_tiled_memcpy().

This additionally measures the time stalled, while also simplifying the
code.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
10 years agoi965: Create drm_intel_bo_map wrappers with performance warnings.
Kenneth Graunke [Wed, 29 Jan 2014 17:09:18 +0000 (09:09 -0800)]
i965: Create drm_intel_bo_map wrappers with performance warnings.

Mapping a buffer is a common place where we could stall the CPU.

In a few places, we've added special code to check whether a buffer is
busy and log the stall as a performance warning.  Most of these give no
indication of the severity of the stall, though, since measuring the
time is a small hassle.

This patch introduces a new brw_bo_map() function which wraps
drm_intel_bo_map, but additionally measures the time stalled and reports
a performance warning.  If performance debugging is not enabled, it
simply maps the buffer with negligible overhead.

We also add a similar wrapper for drm_intel_gem_bo_map_gtt().

This should make it easy to add performance warnings in lots of places.
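
The wrapper pattern described above might look roughly like the following
C sketch.  It is not the code from this patch: the buffer type, the map
function and the clock are all hypothetical stand-ins (a fake clock that
the fake map call advances), so that the example is self-contained and
does not depend on libdrm.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a buffer object. */
struct fake_bo {
    void *storage;
    double stall_seconds;   /* how long a map call pretends to block */
};

/* Hypothetical clock, advanced only by fake_bo_map(). */
static double fake_now = 0.0;

double fake_get_time(void)
{
    return fake_now;
}

/* Hypothetical stand-in for the underlying map call. */
void *fake_bo_map(struct fake_bo *bo)
{
    fake_now += bo->stall_seconds;
    return bo->storage;
}

/* Wrapper: map the buffer and, when performance debugging is enabled,
 * measure how long the call took and report it.  The measured stall is
 * also stored in *stalled_out so callers can inspect it. */
void *timed_bo_map(struct fake_bo *bo, bool perf_debug,
                   const char *name, double *stalled_out)
{
    *stalled_out = 0.0;

    if (!perf_debug)
        return fake_bo_map(bo);     /* fast path: no timing at all */

    double start = fake_get_time();
    void *ptr = fake_bo_map(bo);
    *stalled_out = fake_get_time() - start;

    if (*stalled_out > 0.0)
        fprintf(stderr, "perf: mapping %s stalled %.1f ms\n",
                name, *stalled_out * 1000.0);
    return ptr;
}
```

Keeping the timing inside one wrapper means a call site only has to
switch which function it calls to gain a stall report.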

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
10 years agofreedreno: enabling binning and opt by default
Rob Clark [Mon, 3 Feb 2014 16:28:30 +0000 (11:28 -0500)]
freedreno: enabling binning and opt by default

Hw binning pass doesn't seem to have broken anything.  And optimizing
compiler fixes a lot of shaders and doesn't seem to break anything.  So
slightly re-org the FD_MESA_DEBUG params and make both hw binning and
optimizer enabled by default.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years agofreedreno/a3xx/compiler: new compiler
Rob Clark [Wed, 29 Jan 2014 22:18:49 +0000 (17:18 -0500)]
freedreno/a3xx/compiler: new compiler

The new compiler generates a dependency graph of instructions, including
a few meta-instructions to handle PHI and preserve some extra
information needed for register assignment, etc.

The depth pass assigns a weight/depth to each node (based on the sum of
instruction cycles of a given node and all its dependent nodes), which
is used to schedule instructions.  The scheduling takes into account the
minimum number of cycles/slots between dependent instructions, etc.,
which was something that could not be handled properly with the original
compiler (which was more of a naive TGSI translator than an actual
compiler).
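
As a rough illustration of a depth pass over a dependency graph, the C
sketch below assigns each node its own cycle count plus the largest depth
among the nodes it depends on.  This is one possible reading of the
description above (the sum of cycles along the longest dependency chain),
not the algorithm from the commit, and the struct and function names are
hypothetical.

```c
#include <stddef.h>

#define MAX_SRCS 4

/* Hypothetical instruction node in a dependency graph. */
struct node {
    int cycles;                     /* cost of this instruction, >= 1 */
    size_t num_srcs;                /* number of dependencies */
    struct node *srcs[MAX_SRCS];    /* nodes this one depends on */
    int depth;                      /* 0 means "not computed yet" */
};

/* Assumed weight: own cycles plus the deepest dependency.  Results
 * are cached in node->depth so shared dependencies are visited once. */
int node_depth(struct node *n)
{
    if (n->depth > 0)
        return n->depth;

    int deepest = 0;
    for (size_t i = 0; i < n->num_srcs; i++) {
        int d = node_depth(n->srcs[i]);
        if (d > deepest)
            deepest = d;
    }

    n->depth = n->cycles + deepest;
    return n->depth;
}
```

A scheduler could then prefer the ready node with the greatest depth, so
that the longest chain of dependent instructions is started first.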

The register assignment is currently split out as a standalone pass.  I
expect that it will be replaced at some point, once I figure out what to
do about relative addressing (which is currently the only thing that
should cause fallback to old compiler).

There are a couple new debug options for FD_MESA_DEBUG env var:

  optmsgs - enable debug prints in optimizer
  optdump - dump instruction graph in .dot format, for example:

http://people.freedesktop.org/~robclark/a3xx/frag-0000.dot.png
http://people.freedesktop.org/~robclark/a3xx/frag-0000.dot

At this point, thanks to proper handling of instruction scheduling, the
new compiler fixes a lot of things that were broken before, and does not
appear to break anything that was working before[1].  So even though it
is not finished, it seems useful to merge it in its current state.

[1] Not merged in this commit, because I'm not sure if it really belongs
in mesa tree, but the following commit implements a simple shader
emulator, which I've used to compare the output of the new compiler to
the original compiler (ie. run it on all the TGSI shaders dumped out via
ST_DEBUG=tgsi with various games/apps):

https://github.com/freedreno/mesa/commit/163b6306b1660e05ece2f00d264a8393d99b6f12

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years agofreedreno/a3xx/compiler: split out old compiler
Rob Clark [Wed, 29 Jan 2014 22:03:07 +0000 (17:03 -0500)]
freedreno/a3xx/compiler: split out old compiler

For the time being, keep old compiler as fallback for things that the
new compiler does not support yet.  Split out as its own commit to make
the later new-compiler commits easier to follow.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years agofreedreno/a3xx/compiler: prepare for new compiler
Rob Clark [Wed, 29 Jan 2014 21:25:52 +0000 (16:25 -0500)]
freedreno/a3xx/compiler: prepare for new compiler

Shuffle things around to prepare for new compiler.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years agofreedreno/a3xx: remove useless reg tracking in disasm-a3xx
Rob Clark [Wed, 29 Jan 2014 21:13:54 +0000 (16:13 -0500)]
freedreno/a3xx: remove useless reg tracking in disasm-a3xx

Not really used for anything anymore.  So strip it out and avoid
conflicting symbols with upcoming new-compiler.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years agodocs: Add release notes for 10.0.3
Carl Worth [Mon, 3 Feb 2014 21:54:50 +0000 (13:54 -0800)]
docs: Add release notes for 10.0.3

Which was just made.