mesa.git
10 years agonvc0: create the SW object
Christoph Bumiller [Fri, 7 Feb 2014 21:51:27 +0000 (22:51 +0100)]
nvc0: create the SW object

It's required for being able to use software methods now.

10 years agonvc0/ir/emit: hardcode vertex output stream to 0 for now
Christoph Bumiller [Fri, 7 Feb 2014 21:39:44 +0000 (22:39 +0100)]
nvc0/ir/emit: hardcode vertex output stream to 0 for now

10 years agoi965: Enable ARB_texture_gather for one component on Gen6.
Chris Forbes [Sun, 2 Feb 2014 09:00:18 +0000 (22:00 +1300)]
i965: Enable ARB_texture_gather for one component on Gen6.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agoi965/vec4: Emit shader w/a for Gen6 gather
Chris Forbes [Mon, 3 Feb 2014 09:15:41 +0000 (22:15 +1300)]
i965/vec4: Emit shader w/a for Gen6 gather

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agoi965/fs: Emit shader w/a for Gen6 gather
Chris Forbes [Mon, 3 Feb 2014 09:15:16 +0000 (22:15 +1300)]
i965/fs: Emit shader w/a for Gen6 gather

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agoi965: Add surface format overrides for Gen6 gather
Chris Forbes [Mon, 3 Feb 2014 09:14:45 +0000 (22:14 +1300)]
i965: Add surface format overrides for Gen6 gather

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agoi965: Add Gen6 gather wa to sampler key
Chris Forbes [Mon, 3 Feb 2014 09:13:03 +0000 (22:13 +1300)]
i965: Add Gen6 gather wa to sampler key

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agoglsl: Optimize triop_csel with all-true or all-false.
Eric Anholt [Fri, 1 Nov 2013 19:29:12 +0000 (12:29 -0700)]
glsl: Optimize triop_csel with all-true or all-false.

Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agoglsl: Optimize various cases of fma (aka MAD).
Eric Anholt [Sat, 18 Jan 2014 19:06:16 +0000 (11:06 -0800)]
glsl: Optimize various cases of fma (aka MAD).

Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agoglsl: Optimize lrp(x, x, coefficient) --> x.
Eric Anholt [Sat, 18 Jan 2014 19:00:51 +0000 (11:00 -0800)]
glsl: Optimize lrp(x, x, coefficient) --> x.

total instructions in shared programs: 1627754 -> 1624534 (-0.20%)
instructions in affected programs:     45748 -> 42528 (-7.04%)
GAINED:                                3
LOST:                                  0

(serious sam, humus domino demo)

Reviewed-by: Matt Turner <mattst88@gmail.com>
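
The identity behind this optimization can be checked numerically. This is a minimal sketch, assuming the usual linear-interpolation definition x * (1 - a) + y * a; the helper names are hypothetical and this is not the GLSL IR code the commit touches.

```c
#include <assert.h>

/* Hypothetical scalar model of lrp: x * (1 - a) + y * a. */
float lrp(float x, float y, float a)
{
    return x * (1.0f - a) + y * a;
}

/* Returns 1 when lrp(x, x, a) equals x up to a small rounding tolerance,
 * i.e. interpolating between two equal endpoints is the identity. */
int lrp_same_endpoints_is_identity(float x, float a)
{
    float diff = lrp(x, x, a) - x;
    float mag = x < 0.0f ? -x : x;

    if (diff < 0.0f)
        diff = -diff;
    return diff <= 1e-5f * mag + 1e-6f;
}
```

Since the result does not depend on the coefficient, the whole expression can be replaced by x, which also avoids the rounding of the two multiplies.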
10 years agoglsl: Optimize pow(x, 1) -> x.
Eric Anholt [Sat, 18 Jan 2014 18:57:29 +0000 (10:57 -0800)]
glsl: Optimize pow(x, 1) -> x.

total instructions in shared programs: 1627826 -> 1627754 (-0.00%)
instructions in affected programs:     6640 -> 6568 (-1.08%)
GAINED:                                0
LOST:                                  0

(HoN and savage2)

Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agoglsl: Optimize log(exp(x)) and exp(log(x)) into x.
Eric Anholt [Sat, 18 Jan 2014 18:47:19 +0000 (10:47 -0800)]
glsl: Optimize log(exp(x)) and exp(log(x)) into x.

Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agoglsl: Optimize ~~x into x.
Eric Anholt [Sat, 18 Jan 2014 18:36:28 +0000 (10:36 -0800)]
glsl: Optimize ~~x into x.

v2: Fix pasteo of an extra abs being inserted (caught by many).  Rewrite
    to drop the silly switch statement.

Reviewed-by: Matt Turner <mattst88@gmail.com> (v1)
10 years agoi965: Add some informative debug when the X Server botches DRI2 GetBuffers.
Eric Anholt [Tue, 31 Dec 2013 02:19:21 +0000 (18:19 -0800)]
i965: Add some informative debug when the X Server botches DRI2 GetBuffers.

We've had various bug reports over the years where miptrees are missing,
and when I screwed it up while adding DRI2 to the modesetting driver, I
figured I should put the info necessary for debug here.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agoi965: Remove redundant check in blitter-based glBlitFramebuffer().
Eric Anholt [Mon, 30 Sep 2013 22:19:54 +0000 (15:19 -0700)]
i965: Remove redundant check in blitter-based glBlitFramebuffer().

The intel_miptree_blit() code checks the format for us now, plus it
handles xrgb vs argb for us.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agoi965: Fix Gen8+ disassembly of half float subregister numbers.
Kenneth Graunke [Wed, 29 Jan 2014 22:12:51 +0000 (14:12 -0800)]
i965: Fix Gen8+ disassembly of half float subregister numbers.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agoi965: Use the new brw_load_register_mem helper for draw indirect.
Kenneth Graunke [Thu, 30 Jan 2014 04:51:28 +0000 (20:51 -0800)]
i965: Use the new brw_load_register_mem helper for draw indirect.

This makes it work on Broadwell, too.

v2: Drop bogus double write to 3DPRIM_BASE_VERTEX register
    (caught by Chris Forbes).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years agoi965: Implement a brw_load_register_mem helper function.
Kenneth Graunke [Thu, 30 Jan 2014 04:43:49 +0000 (20:43 -0800)]
i965: Implement a brw_load_register_mem helper function.

This saves some boilerplate and hides the OUT_RELOC/OUT_RELOC64
distinction.

Placing the function in intel_batchbuffer.c is rather arbitrary; there
wasn't really an obvious place for it.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years agoi965: Fix INTEL_DEBUG=vs for fixed-function/ARB programs.
Kenneth Graunke [Mon, 3 Feb 2014 19:13:48 +0000 (11:13 -0800)]
i965: Fix INTEL_DEBUG=vs for fixed-function/ARB programs.

Since commit 9cee3ff562f3e4b51bfd30338fd1ba7716ac5737, INTEL_DEBUG=vs
has caused a NULL pointer dereference for fixed-function/ARB programs.

In the vec4 generators, "prog" is a gl_program, and "shader_prog" is the
gl_shader_program.  This is different than the FS visitor.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agoglsl: Don't lose precision qualifiers when encountering "centroid".
Kenneth Graunke [Thu, 6 Feb 2014 05:42:00 +0000 (21:42 -0800)]
glsl: Don't lose precision qualifiers when encountering "centroid".

Mesa fails to retain the precision qualifier when parsing:

   #version 300 es
   centroid in mediump vec2 v;

Consider how the parser's type_qualifier production is applied.
First, the precision_qualifier rule creates a new ast_type_qualifier:

    <precision: mediump>

Then the storage_qualifier rule creates a second one:

    <flags: in>

and calls merge_qualifier() to fold in any previous qualifications,
returning:

    <flags: in, precision: mediump>

Finally, the auxiliary_storage_qualifier creates one for "centroid":

    <flags: centroid>

it then does $$ = $1 and $$.flags |= $2.flags, resulting in:

    <flags: centroid, in>

Since precision isn't stored in the flags bitfield, it is lost.  We need
to instead call merge_qualifier to combine all the fields.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reported-by: Kevin Rogovin <kevin.rogovin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
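
The difference between OR-ing only the flags and merging every field can be sketched as follows. The struct and helper names are simplified, hypothetical stand-ins for ast_type_qualifier and merge_qualifier(), not the real parser code.

```c
#include <assert.h>

/* Hypothetical flag bits and precision values for this sketch. */
enum { Q_IN = 1u << 0, Q_CENTROID = 1u << 1 };
enum precision { PREC_NONE = 0, PREC_MEDIUMP };

/* Simplified stand-in for ast_type_qualifier. */
struct qualifier {
    unsigned flags;
    enum precision precision;  /* not part of the flags bitfield */
};

/* Old behaviour: start from the new qualifier and OR in only the flags
 * of the previous one.  Any non-flag field of q_prev is dropped. */
struct qualifier combine_flags_only(struct qualifier q_new,
                                    struct qualifier q_prev)
{
    struct qualifier out = q_new;
    out.flags |= q_prev.flags;
    return out;
}

/* Fixed behaviour: merge every field, so precision survives. */
struct qualifier merge_all(struct qualifier q_new, struct qualifier q_prev)
{
    struct qualifier out = q_new;
    out.flags |= q_prev.flags;
    if (out.precision == PREC_NONE)
        out.precision = q_prev.precision;
    return out;
}
```

With q_prev = <flags: in, precision: mediump> and q_new = <flags: centroid>, the first helper yields no precision while the second keeps mediump.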
10 years agost/mesa: avoid sw fallback for getting/decompressing textures
Brian Paul [Fri, 7 Feb 2014 16:32:05 +0000 (09:32 -0700)]
st/mesa: avoid sw fallback for getting/decompressing textures

If st_GetTexImage() is to decompress the texture, avoid the fallback
path even if prefer_blit_based_texture_transfer = false.  For drivers
that returned PIPE_CAP_PREFER_BLIT_BASED_TEXTURE_TRANSFER = 0, we
were always taking the fallback path for texture decompression rather
than rendering a quad.  The latter is a lot faster.

Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
10 years agogallium/tgsi: correct typo propagated from NV_vertex_program1_1
Erik Faye-Lund [Fri, 7 Feb 2014 12:45:11 +0000 (13:45 +0100)]
gallium/tgsi: correct typo propagated from NV_vertex_program1_1

In the specification text of NV_vertex_program1_1, the upper
limit of the RCC instruction is written as 1.884467e+19 in
scientific notation, but as 0x5F800000 in binary. But the binary
version translates to 1.84467e+19 rather than 1.884467e+19 in
scientific notation.

Since the lower-limit equals 2^-64 and the binary version equals
2^+64, let's assume the value in scientific notation is a typo
and implement this using the value from the binary version
instead.

Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
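
The claim that 0x5F800000 is 2^+64 can be verified by reinterpreting the bit pattern as an IEEE-754 single. This is a small sketch with a hypothetical helper name; it assumes float is a 32-bit IEEE-754 type and is not taken from the TGSI code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a single-precision float.
 * Assumes float is IEEE-754 binary32. */
float float_from_bits(uint32_t bits)
{
    float f;

    assert(sizeof f == sizeof bits);
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* 0x5F800000: sign 0, biased exponent 191, mantissa 0 -> 2^(191-127) = 2^64.
 * 0x1F800000: sign 0, biased exponent  63, mantissa 0 -> 2^(63-127)  = 2^-64. */
float rcc_upper_limit(void) { return float_from_bits(0x5F800000u); }
float rcc_lower_limit(void) { return float_from_bits(0x1F800000u); }
```

2^64 is 18446744073709551616, which is about 1.84467e+19, so the binary value agrees with the lower limit of 2^-64 and not with the 1.884467e+19 printed in the specification text.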
10 years agogallium/tgsi: use CLAMP instead of open-coded clamps
Erik Faye-Lund [Fri, 7 Feb 2014 12:45:10 +0000 (13:45 +0100)]
gallium/tgsi: use CLAMP instead of open-coded clamps

Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agoegl: Unhide functionality in _eglInitSurface()
Juha-Pekka Heikkila [Fri, 7 Feb 2014 12:44:05 +0000 (14:44 +0200)]
egl: Unhide functionality in _eglInitSurface()

_eglInitResource() was used to memset the entire _EGLSurface by
writing more than the size of the pointed-to target. This only works
as long as Resource is the first element in _EGLSurface; this patch
removes that dependency.

Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
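
The dependency described above can be illustrated with a simplified sketch. The struct layouts and function names here are hypothetical stand-ins, not the real _EGLResource or _EGLSurface definitions.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the EGL structs. */
struct resource {
    int is_linked;
    int ref_count;
};

struct surface {
    struct resource res;  /* happens to be the first member */
    int width;
    int height;
};

/* Clears `size` bytes starting at res.  Passing sizeof(struct surface)
 * here would clear the enclosing object only if res sits at offset 0. */
void init_resource(struct resource *res, size_t size)
{
    assert(size >= sizeof *res);
    memset(res, 0, size);
    res->ref_count = 1;
}

/* Explicit version: clear the whole surface by its own type, then
 * initialize only the resource part. */
void init_surface(struct surface *surf)
{
    memset(surf, 0, sizeof *surf);
    init_resource(&surf->res, sizeof surf->res);
}
```

Clearing the outer struct explicitly keeps working if the resource member is ever moved away from the start of the struct.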
10 years agoegl: Unhide functionality in _eglInitContext()
Juha-Pekka Heikkila [Fri, 7 Feb 2014 12:44:04 +0000 (14:44 +0200)]
egl: Unhide functionality in _eglInitContext()

_eglInitResource() was used to memset the entire _EGLContext by
writing more than the size of the pointed-to target. This only works
as long as Resource is the first element in _EGLContext; this patch
removes that dependency.

Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agoglx: Add missing null check in __glX_send_client_info()
Juha-Pekka Heikkila [Fri, 7 Feb 2014 12:44:03 +0000 (14:44 +0200)]
glx: Add missing null check in __glX_send_client_info()

Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agoi965: Add missing null check in fs_visitor::dead_code_eliminate_local()
Juha-Pekka Heikkila [Fri, 7 Feb 2014 12:44:02 +0000 (14:44 +0200)]
i965: Add missing null check in fs_visitor::dead_code_eliminate_local()

Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agoglx: Add some missing null checks in glx_pbuffer.c
Juha-Pekka Heikkila [Fri, 7 Feb 2014 12:44:00 +0000 (14:44 +0200)]
glx: Add some missing null checks in glx_pbuffer.c

Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agoglsl: Fix null access on file read error
Juha-Pekka Heikkila [Fri, 7 Feb 2014 12:43:59 +0000 (14:43 +0200)]
glsl: Fix null access on file read error

Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agoglx: Add missing null check in __glXCloseDisplay
Juha-Pekka Heikkila [Fri, 7 Feb 2014 12:43:58 +0000 (14:43 +0200)]
glx: Add missing null check in __glXCloseDisplay

Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agoglx: Add missing null checks in glxcmds.c
Juha-Pekka Heikkila [Fri, 7 Feb 2014 12:43:57 +0000 (14:43 +0200)]
glx: Add missing null checks in glxcmds.c

Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agomain/get: support ARB_gpu_shader5
Jordan Justen [Sat, 25 Jan 2014 18:55:22 +0000 (10:55 -0800)]
main/get: support ARB_gpu_shader5

If a driver enables ARB_gpu_shader5 and sets Const.MaxVertexStreams >= 4,
then piglit's arb_gpu_shader5-minmax test should now pass.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agoglapi: add definitions for ARB_gpu_shader5
Jordan Justen [Sat, 25 Jan 2014 18:55:21 +0000 (10:55 -0800)]
glapi: add definitions for ARB_gpu_shader5

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agonouveau/codegen: allow tex offsets on non-TXF instructions (e.g. TXL)
Ilia Mirkin [Tue, 4 Feb 2014 07:53:54 +0000 (02:53 -0500)]
nouveau/codegen: allow tex offsets on non-TXF instructions (e.g. TXL)

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>
10 years agonv50: only over-allocate by a page for code
Ilia Mirkin [Tue, 4 Feb 2014 07:30:18 +0000 (02:30 -0500)]
nv50: only over-allocate by a page for code

The pre-fetching doesn't go too far. Tested with over-allocating by only
a page, and didn't see any errors in dmesg. Saves ~512KB of VRAM.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 10.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>
10 years agonv50: fix layerid to be the fp input number rather than vp output number
Ilia Mirkin [Tue, 4 Feb 2014 04:35:14 +0000 (23:35 -0500)]
nv50: fix layerid to be the fp input number rather than vp output number

In the tests they were the same so it didn't matter, but indications are
that this is the correct behaviour. Also take this opportunity to
(trivially) support using gl_Layer in fp.

Cc: 10.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>
10 years agonv50: rework primid logic
Ilia Mirkin [Tue, 4 Feb 2014 04:20:32 +0000 (23:20 -0500)]
nv50: rework primid logic

Functionally identical but much simpler. Should also better integrate
with future layer/viewport changes/fixes.

Cc: 10.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>
10 years agoglx: Pass NULL DRI drawables into the DRI driver for None GLX drawables
Kristian Høgsberg [Wed, 5 Feb 2014 19:43:58 +0000 (11:43 -0800)]
glx: Pass NULL DRI drawables into the DRI driver for None GLX drawables

GLX_ARB_create_context allows making a GLX context current with None
drawable and readables, but this was never implemented correctly in GLX.
We would create a __DRIdrawable for the None GLX drawable and pass that
to the DRI driver and that would somehow work.  Now it's somehow broken.

The way this should have worked is that we pass a NULL DRI drawable
to the DRI driver when the GLX user calls glXMakeContextCurrent()
with None for drawable and readables.

https://bugs.freedesktop.org/show_bug.cgi?id=74143
Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
10 years agost/vdpau: add flush on unmap
Christian König [Tue, 28 Jan 2014 14:22:05 +0000 (15:22 +0100)]
st/vdpau: add flush on unmap

Flush the context when we unmap a buffer, otherwise VDPAU might
start rendering the next frame while we still reference that buffer.

Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: StrangeNoises (rachel@strangenoises.org)
10 years agovdpau: flush the context before exporting the surface v2
Marek Olšák [Mon, 13 Jan 2014 13:13:01 +0000 (14:13 +0100)]
vdpau: flush the context before exporting the surface v2

Bugzilla (bug needs XBMC changes as well):
https://bugs.freedesktop.org/show_bug.cgi?id=73191

When VL uploads vertex buffers, it uses PIPE_TRANSFER_DONTBLOCK, which always
flushes the context in the winsys if the buffer being mapped is busy. Since
I added handling of DISCARD_RANGE, DONTBLOCK has had no effect when combined
with DISCARD_RANGE and I think the context isn't flushed anywhere else,
so no commands are submitted to the GPU until the IB is full, which takes
a lot of frames.

Using DISCARD_RANGE is not the only way to trigger this bug. The other way
is to reallocate the vertex buffer before every upload.

BTW, I'm not sure if this is the right place for flushing, but it does fix
the bug.

v2 (chk): move the flush to the right place.

Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: StrangeNoises (rachel@strangenoises.org)
10 years agoglsl: Initialize ubo_binding_mask flags to zero.
Matt Turner [Mon, 3 Feb 2014 19:51:51 +0000 (11:51 -0800)]
glsl: Initialize ubo_binding_mask flags to zero.

Missed in commit e63bb298. Caused sporadic test failures, like
incorrect-in-layout-qualifier-repeated-prim.geom.

Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
10 years agogallium/radeon: fix warnings
Marek Olšák [Thu, 6 Feb 2014 16:43:29 +0000 (17:43 +0100)]
gallium/radeon: fix warnings

10 years agogallium: remove PIPE_USAGE_STATIC
Marek Olšák [Mon, 3 Feb 2014 02:42:17 +0000 (03:42 +0100)]
gallium: remove PIPE_USAGE_STATIC

Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agogallium: define the behavior of PIPE_USAGE_* flags properly
Marek Olšák [Mon, 3 Feb 2014 02:21:29 +0000 (03:21 +0100)]
gallium: define the behavior of PIPE_USAGE_* flags properly

STATIC will be removed in the following commit.

v2: changed the definition of IMMUTABLE

Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agogallium: remove PIPE_RESOURCE_FLAG_GEN_MIPS
Marek Olšák [Mon, 3 Feb 2014 02:20:13 +0000 (03:20 +0100)]
gallium: remove PIPE_RESOURCE_FLAG_GEN_MIPS

Unused.

Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agor600g,radeonsi: set resource domains in one place (v2)
Marek Olšák [Tue, 4 Feb 2014 17:35:40 +0000 (18:35 +0100)]
r600g,radeonsi: set resource domains in one place (v2)

v2: This doesn't change the behavior. It only moves the tiling check
    to r600_init_resource and removes the usage parameter.

Reviewed-by: Christian König <christian.koenig@amd.com>
10 years agost/mesa: fix crash when a shader uses a TBO and it's not bound
Marek Olšák [Thu, 6 Feb 2014 01:16:50 +0000 (02:16 +0100)]
st/mesa: fix crash when a shader uses a TBO and it's not bound

This binds a NULL sampler view in that case.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74251

Cc: "10.1" "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agost/omx: add workaround for bug in Bellagio
Christian König [Tue, 28 Jan 2014 13:21:14 +0000 (06:21 -0700)]
st/omx: add workaround for bug in Bellagio

Not blocking on the message thread can lead to accessing freed memory.

Signed-off-by: Christian König <christian.koenig@amd.com>
10 years agost/omx: initial OpenMAX support v3
Christian König [Mon, 5 Aug 2013 17:41:27 +0000 (11:41 -0600)]
st/omx: initial OpenMAX support v3

Featuring a full grown MPEG2 and H264 decoder and a couple of hundred bugs.

v2 (Leo): fix an error for pic_order_cnt_type 1
v3 (Leo): implement support for field decoding

Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Leo Liu <leo.liu@amd.com>
10 years agovl/rbsp: add H.264 RBSP implementation
Christian König [Tue, 17 Sep 2013 14:20:32 +0000 (08:20 -0600)]
vl/rbsp: add H.264 RBSP implementation

Signed-off-by: Christian König <christian.koenig@amd.com>
10 years agovl/vlc: add function to limit the vlc size
Christian König [Tue, 17 Sep 2013 13:27:38 +0000 (07:27 -0600)]
vl/vlc: add function to limit the vlc size

Signed-off-by: Christian König <christian.koenig@amd.com>
10 years agovl/vlc: add remove bits function
Christian König [Tue, 17 Sep 2013 13:22:34 +0000 (07:22 -0600)]
vl/vlc: add remove bits function

Signed-off-by: Christian König <christian.koenig@amd.com>
10 years agoradeon: update legal notes on UVD
Christian König [Mon, 3 Feb 2014 17:12:43 +0000 (10:12 -0700)]
radeon: update legal notes on UVD

Signed-off-by: Christian König <christian.koenig@amd.com>
10 years agoradeon: just don't map VRAM buffers at all
Christian König [Mon, 27 Jan 2014 10:40:25 +0000 (03:40 -0700)]
radeon: just don't map VRAM buffers at all

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
10 years agoradeon/video: directly create buffers in the right domain
Christian König [Tue, 21 Jan 2014 18:49:06 +0000 (11:49 -0700)]
radeon/video: directly create buffers in the right domain

Avoid moving things around on start of stream.

Signed-off-by: Christian König <christian.koenig@amd.com>
10 years agoradeon/video: separate common video functions
Christian König [Thu, 17 Oct 2013 12:21:40 +0000 (06:21 -0600)]
radeon/video: separate common video functions

Signed-off-by: Christian König <christian.koenig@amd.com>
10 years agogallium/dri2: Fix dri2_dup_image
Axel Davy [Thu, 30 Jan 2014 15:10:54 +0000 (16:10 +0100)]
gallium/dri2: Fix dri2_dup_image

dri2_dup_image was not copying the dri_format field.

This was causing some bugs, for example:
. we create a gbm_bo.
. we get an EGLImage from the gbm_bo.
. Bug: it is impossible to get the gbm_bo back from the EGLImage by
  importing it. (gbm dri2 backend)

Signed-off-by: Axel Davy <axel.davy@ens.fr>
10 years agoi965/vs: Fix typo in brw_compute_vue_map
Chris Forbes [Sat, 25 Jan 2014 06:51:50 +0000 (19:51 +1300)]
i965/vs: Fix typo in brw_compute_vue_map

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agoi965: Fix register types in dump_instructions().
Kenneth Graunke [Wed, 5 Feb 2014 21:27:15 +0000 (13:27 -0800)]
i965: Fix register types in dump_instructions().

This regressed when I converted BRW_REGISTER_TYPE_* to be an abstract
type that doesn't match the hardware description.  dump_instruction()
was using reg_encoding[] from brw_disasm.c, which no longer matches
(and was incorrect for Gen8+ anyway).

This patch introduces a new function to convert the abstract enum values
into the letter suffix we expect.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reported-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agoegl/glx: Remove egl_glx driver
Chad Versace [Tue, 7 Jan 2014 20:08:30 +0000 (12:08 -0800)]
egl/glx: Remove egl_glx driver

Mesa now has a real, feature-rich EGL implementation on X11 via xcb.
Therefore I believe there is no longer a practical need for the egl_glx
driver.

Furthermore, egl_glx appears to be unmaintained.  The most recent
nontrivial commit to egl_glx was 6baa5f1 on 2011-11-25.

Tested by running weston-smoke in windowed Weston on X with i965.

Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Kristian Høgsberg <krh@bitplanet.net>
10 years agodocs: update 10.1 relnotes to note GL 3.3 on r600 and radeonsi.
Dave Airlie [Thu, 6 Feb 2014 01:03:09 +0000 (01:03 +0000)]
docs: update 10.1 relnotes to note GL 3.3 on r600 and radeonsi.

Signed-off-by: Dave Airlie <airlied@redhat.com>
10 years agotgsi/ureg: increase the number of immediates
Zack Rusin [Wed, 5 Feb 2014 00:33:12 +0000 (19:33 -0500)]
tgsi/ureg: increase the number of immediates

ureg_program is allocated on the heap so we can just bump the
number of immediates that it can handle. It's needed for d3d10.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
10 years agogallivm: make sure analysis works with large number of immediates
Zack Rusin [Wed, 5 Feb 2014 00:32:04 +0000 (19:32 -0500)]
gallivm: make sure analysis works with large number of immediates

We need to handle a lot more immediates and in order to do that
we also switch from allocating this structure on the stack to
allocating it on the heap.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
10 years agogallivm: handle huge number of immediates
Zack Rusin [Wed, 5 Feb 2014 00:28:58 +0000 (19:28 -0500)]
gallivm: handle huge number of immediates

We only supported up to 256 immediates, which isn't enough. We had
code which was allocating immediates as a dynamically allocated array,
but it was always used alongside a statically backed array for
performance reasons. This commit adds code to skip that performance
optimization and always use just the dynamically allocated immediates
if the number of them is too great.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
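
The fallback described above (a fixed-size fast path, with heap storage only when the count is too large) can be sketched as below. All names are hypothetical and the layout is an assumption for illustration; only the limit of 256 comes from the commit message.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_STATIC_IMMEDIATES 256  /* limit quoted in the commit message */

/* Hypothetical container: inline storage for the common case, heap
 * storage only when the count exceeds the inline capacity. */
struct imm_store {
    float inline_vals[MAX_STATIC_IMMEDIATES];
    float *heap_vals;
    size_t count;
};

/* Returns 0 on success, -1 if the heap allocation fails. */
int imm_store_init(struct imm_store *s, size_t count)
{
    assert(s != NULL);
    s->count = count;
    s->heap_vals = NULL;
    if (count > MAX_STATIC_IMMEDIATES) {
        s->heap_vals = calloc(count, sizeof *s->heap_vals);
        if (s->heap_vals == NULL)
            return -1;
    }
    return 0;
}

/* Picks whichever backing array is active for this store. */
float *imm_store_data(struct imm_store *s)
{
    return s->heap_vals ? s->heap_vals : s->inline_vals;
}

void imm_store_free(struct imm_store *s)
{
    free(s->heap_vals);
    s->heap_vals = NULL;
}
```

Callers go through imm_store_data() and never need to know which path was taken.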
10 years agogallivm: allow large numbers of temporaries
Zack Rusin [Tue, 4 Feb 2014 02:40:24 +0000 (21:40 -0500)]
gallivm: allow large numbers of temporaries

The number of allowed temporaries increases with almost every
iteration of an API. We used to support 128, then we started
increasing the limit, and the newer APIs support 4096+. So if we
notice that the number of temporaries is larger than our statically
allocated storage would allow, we just treat them as indexable
temporaries and allocate them as an array from the start.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
10 years agoi965/fs: Assume FBO rendering in precompile if MRT.
Chris Forbes [Sat, 25 Jan 2014 22:04:42 +0000 (11:04 +1300)]
i965/fs: Assume FBO rendering in precompile if MRT.

If multiple color outputs are written, this shader is unlikely to be
useful with a winsys framebuffer.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agoi965/fs: Guess nr_color_regions better in precompile
Chris Forbes [Sat, 25 Jan 2014 22:03:33 +0000 (11:03 +1300)]
i965/fs: Guess nr_color_regions better in precompile

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agodocs: Add relnotes for 10.2
Chris Forbes [Wed, 5 Feb 2014 21:17:17 +0000 (10:17 +1300)]
docs: Add relnotes for 10.2

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
10 years agomesa: Bump version to 10.2.0-devel
Chris Forbes [Wed, 5 Feb 2014 21:14:40 +0000 (10:14 +1300)]
mesa: Bump version to 10.2.0-devel

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
10 years agoi965: Move intel_prepare_render() above first buffer access
Kristian Høgsberg [Wed, 5 Feb 2014 18:59:02 +0000 (10:59 -0800)]
i965: Move intel_prepare_render() above first buffer access

The driver is supposed to ensure buffers before any drawing operation, but in
do_blit_drawpixels() and do_blit_copypixels() we inspect the buffer format
before calling intel_prepare_render().  That was covered up by the
unconditional call to intel_prepare_render() in intelMakeCurrent(), but we
now only do this on the initial intelMakeCurrent call for a context
(to get the size for the initial viewport values).

https://bugs.freedesktop.org/show_bug.cgi?id=74083

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Tested-by: Alexander Monakov <amonakov@gmail.com>
10 years agost/mesa: add MESA_SHADER_COMPUTE case in shader_stage_to_ptarget()
Brian Paul [Wed, 5 Feb 2014 17:45:14 +0000 (10:45 -0700)]
st/mesa: add MESA_SHADER_COMPUTE case in shader_stage_to_ptarget()

Silences compiler warning.  Trivial.

10 years agomesa: re-wrap, fix-up comment text in formats.h
Brian Paul [Tue, 4 Feb 2014 19:19:42 +0000 (12:19 -0700)]
mesa: re-wrap, fix-up comment text in formats.h

Wrap to 78 columns, fix comment formatting.
Trivial.

10 years agoi965/cs: Allow ARB_compute_shader to be enabled via env var.
Paul Berry [Mon, 6 Jan 2014 23:12:05 +0000 (15:12 -0800)]
i965/cs: Allow ARB_compute_shader to be enabled via env var.

This will allow testing of compute shader functionality before it is
completed.

To enable ARB_compute_shader functionality in the i965 driver, set
INTEL_COMPUTE_SHADER=1.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
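
An environment-variable gate of this kind might look like the sketch below. The helper names are hypothetical, and how the driver actually parses the value is not stated in the commit message; this sketch assumes only the exact string "1" enables the feature.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical parser: treats only the exact string "1" as enabled.
 * An unset variable (NULL) counts as disabled. */
int flag_is_set(const char *value)
{
    return value != NULL && strcmp(value, "1") == 0;
}

/* Reads INTEL_COMPUTE_SHADER from the environment at call time. */
int compute_shader_enabled(void)
{
    return flag_is_set(getenv("INTEL_COMPUTE_SHADER"));
}
```

Running a program as `INTEL_COMPUTE_SHADER=1 ./app` would then make compute_shader_enabled() return 1 in this sketch.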
10 years agoi965/cs: Create the brw_compute_program struct, and the code to initialize it.
Paul Berry [Tue, 7 Jan 2014 23:51:13 +0000 (15:51 -0800)]
i965/cs: Create the brw_compute_program struct, and the code to initialize it.

v2: Fix comment.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
10 years agoglsl/cs: Prohibit mixing of compute and non-compute shaders.
Paul Berry [Wed, 8 Jan 2014 19:40:23 +0000 (11:40 -0800)]
glsl/cs: Prohibit mixing of compute and non-compute shaders.

Fixes piglit test:
spec/ARB_compute_shader/linker/mix_compute_and_non_compute

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
10 years agoglsl/cs: Prohibit user-defined ins/outs in compute shaders.
Paul Berry [Wed, 8 Jan 2014 09:54:26 +0000 (01:54 -0800)]
glsl/cs: Prohibit user-defined ins/outs in compute shaders.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
10 years agomain/cs: Implement query for COMPUTE_WORK_GROUP_SIZE.
Paul Berry [Thu, 9 Jan 2014 12:03:30 +0000 (04:03 -0800)]
main/cs: Implement query for COMPUTE_WORK_GROUP_SIZE.

v2: Improve error message.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
10 years agomesa/cs: Handle compute shader local size during linking.
Paul Berry [Wed, 8 Jan 2014 19:59:28 +0000 (11:59 -0800)]
mesa/cs: Handle compute shader local size during linking.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
10 years agoglsl/cs: Handle compute shader local_size_{x,y,z} declaration.
Paul Berry [Mon, 6 Jan 2014 17:09:31 +0000 (09:09 -0800)]
glsl/cs: Handle compute shader local_size_{x,y,z} declaration.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
10 years agomesa/cs: Implement MAX_COMPUTE_WORK_GROUP_COUNT constant.
Paul Berry [Wed, 8 Jan 2014 09:42:58 +0000 (01:42 -0800)]
mesa/cs: Implement MAX_COMPUTE_WORK_GROUP_COUNT constant.

v2: Document that the 3-element array MaxComputeWorkGroupCount is
indexed by dimension.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
10 years agomesa/cs: Implement MAX_COMPUTE_WORK_GROUP_INVOCATIONS constant.
Paul Berry [Mon, 6 Jan 2014 23:11:40 +0000 (15:11 -0800)]
mesa/cs: Implement MAX_COMPUTE_WORK_GROUP_INVOCATIONS constant.

Reviewed-by: Matt Turner <mattst88@gmail.com>
v2: Use CONTEXT_INT rather than CONTEXT_ENUM.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
10 years agomesa/cs: Implement MAX_COMPUTE_WORK_GROUP_SIZE constant.
Paul Berry [Mon, 6 Jan 2014 21:31:58 +0000 (13:31 -0800)]
mesa/cs: Implement MAX_COMPUTE_WORK_GROUP_SIZE constant.

v2: Document that the 3-element array MaxComputeWorkGroupSize is
indexed by dimension.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
10 years agomesa/cs: Create the gl_compute_program struct, and the code to initialize it.
Paul Berry [Tue, 7 Jan 2014 23:50:39 +0000 (15:50 -0800)]
mesa/cs: Create the gl_compute_program struct, and the code to initialize it.

Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agomesa/cs: Handle compute shaders in _mesa_use_program().
Paul Berry [Tue, 7 Jan 2014 17:00:02 +0000 (09:00 -0800)]
mesa/cs: Handle compute shaders in _mesa_use_program().

v2: do cs after the ordered pipeline stages for consistency.

Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agoglsl/cs: update main.cpp to use the ".comp" extension for compute shaders.
Paul Berry [Tue, 7 Jan 2014 17:00:02 +0000 (09:00 -0800)]
glsl/cs: update main.cpp to use the ".comp" extension for compute shaders.

Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agoglsl/cs: Populate default values for ctx->Const.Program[MESA_SHADER_COMPUTE].
Paul Berry [Tue, 7 Jan 2014 04:06:05 +0000 (20:06 -0800)]
glsl/cs: Populate default values for ctx->Const.Program[MESA_SHADER_COMPUTE].

Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agomesa/cs: Add a MESA_SHADER_COMPUTE stage and update switch statements.
Paul Berry [Tue, 7 Jan 2014 04:06:05 +0000 (20:06 -0800)]
mesa/cs: Add a MESA_SHADER_COMPUTE stage and update switch statements.

This patch adds MESA_SHADER_COMPUTE to the gl_shader_stage enum.
Also, where it is trivial to do so, it adds a compute shader case to
switch statements that switch based on the type of shader.  This
avoids "unhandled switch case" compiler warnings.
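
As a rough illustration of the pattern described above (not the actual Mesa
code), a switch over a stage enum gains a compute case so the compiler no
longer sees an unhandled enumerator. The names demo_stage and
demo_stage_name are hypothetical, invented for this sketch.

```c
#include <stddef.h>

/* Hypothetical stand-in for a shader stage enum; not Mesa's definition. */
enum demo_stage {
    DEMO_STAGE_VERTEX,
    DEMO_STAGE_GEOMETRY,
    DEMO_STAGE_FRAGMENT,
    DEMO_STAGE_COMPUTE      /* the newly added stage */
};

/* Every enumerator has a case, so -Wswitch has nothing to report. */
const char *demo_stage_name(enum demo_stage stage)
{
    switch (stage) {
    case DEMO_STAGE_VERTEX:   return "vertex";
    case DEMO_STAGE_GEOMETRY: return "geometry";
    case DEMO_STAGE_FRAGMENT: return "fragment";
    case DEMO_STAGE_COMPUTE:  return "compute";
    }
    return NULL;            /* only reached for out-of-range values */
}
```

Leaving out a default label is deliberate in this sketch: it keeps the
compiler warning active if yet another stage is added later.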

Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agoglsl/cs: Change some linker loops to use MESA_SHADER_FRAGMENT as a bound.
Paul Berry [Tue, 7 Jan 2014 03:47:25 +0000 (19:47 -0800)]
glsl/cs: Change some linker loops to use MESA_SHADER_FRAGMENT as a bound.

Linker loops that iterate through all the stages in the pipeline need
to use MESA_SHADER_FRAGMENT as a bound, so that we can add an
additional MESA_SHADER_COMPUTE stage, without it being erroneously
included in the pipeline.
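
A minimal sketch of the loop-bound idea, assuming the compute stage is
numbered after the fragment stage. The enum sketch_stage and the function
sketch_count_pipeline_stages are hypothetical and only illustrate the
bound; they are not taken from the linker.

```c
/* Hypothetical stage numbering: compute comes after fragment. */
enum sketch_stage {
    SKETCH_VERTEX = 0,
    SKETCH_GEOMETRY,
    SKETCH_FRAGMENT,
    SKETCH_COMPUTE,
    SKETCH_STAGE_COUNT
};

/* Count the stages present in the rendering pipeline. The loop stops at
 * the fragment stage, so the compute slot is never visited. */
int sketch_count_pipeline_stages(const int present[SKETCH_STAGE_COUNT])
{
    int n = 0;
    for (int i = 0; i <= SKETCH_FRAGMENT; i++) {
        if (present[i])
            n++;
    }
    return n;
}
```

Had the loop been bounded by SKETCH_STAGE_COUNT instead, a program with a
compute shader attached would be counted as having one more pipeline stage.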

Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agomesa/cs: Add dispatch API stubs for ARB_compute_shader.
Paul Berry [Mon, 6 Jan 2014 23:08:04 +0000 (15:08 -0800)]
mesa/cs: Add dispatch API stubs for ARB_compute_shader.

Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agomesa/cs: Add extension enable flags for ARB_compute_shader.
Paul Berry [Mon, 6 Jan 2014 17:09:07 +0000 (09:09 -0800)]
mesa/cs: Add extension enable flags for ARB_compute_shader.

Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agogallivm: fix F2U opcode
Roland Scheidegger [Tue, 4 Feb 2014 18:53:53 +0000 (19:53 +0100)]
gallivm: fix F2U opcode

Previously, we were really doing F2I. Also move it to the generic section.
(Note that for llvmpipe the generated code is definitely bad, due to the lack
of unsigned conversions in SSE. I think, though, that what LLVM does (using
scalar conversions to 64-bit signed, either with the x87 FPU (32-bit) or SSE
(64-bit), including lots of domain changes) is quite suboptimal; it could do
something like
is_large = arg >= 2^31
half_arg = 0.5 * arg
small_c = fptoint(arg)
large_c = fptoint(half_arg) << 1
res = select(is_large, large_c, small_c)
which should be far fewer instructions, but that's something LLVM should do
itself.)
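
The pseudocode above can be sketched in scalar C as follows. This is only
an illustration of the idea, not the gallivm implementation: the function
name sketch_f2u is hypothetical, inputs are assumed to lie in [0, 2^32),
and the two conversions are guarded by a branch because an out-of-range
float-to-int cast is undefined in C (a vector version would compute both
and select).

```c
#include <stdint.h>

/* Float -> uint32 using only signed 32-bit conversions.
 * Assumes 0 <= arg < 2^32; other inputs are outside this sketch. */
uint32_t sketch_f2u(float arg)
{
    int is_large = arg >= 2147483648.0f;     /* arg >= 2^31 */

    if (is_large) {
        /* Halving is exact here, and the result fits in int32. Floats in
         * [2^31, 2^32) are multiples of 256, so no low bit is lost. */
        float half_arg = 0.5f * arg;
        return ((uint32_t)(int32_t)half_arg) << 1;
    }

    /* Small values convert directly through the signed path. */
    return (uint32_t)(int32_t)arg;
}
```

For example, sketch_f2u(3000000000.0f) takes the large path: the input is
halved to 1500000000, converted as a signed value, and shifted back up.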

This fixes piglit fs/vs-float-uint-conversion.shader_test (maybe more, needs
GL 3.0 version override to run.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Zack Rusin <zackr@vmware.com>
10 years agotools/trace: Handle index buffer overflow gracefully.
José Fonseca [Fri, 31 Jan 2014 16:44:39 +0000 (16:44 +0000)]
tools/trace: Handle index buffer overflow gracefully.

Trivial.

10 years agodocs/GL3.txt: update r600 status
Dave Airlie [Tue, 4 Feb 2014 21:52:48 +0000 (07:52 +1000)]
docs/GL3.txt: update r600 status

This updates the r600 driver status to 3.3 being fully supported.

Signed-off-by: Dave Airlie <airlied@redhat.com>
10 years agor600g: add support for geom shaders to r600/r700 chipsets (v2)
Dave Airlie [Thu, 30 Jan 2014 04:19:57 +0000 (04:19 +0000)]
r600g: add support for geom shaders to r600/r700 chipsets (v2)

This is my first attempt at enabling r600/r700 geometry shaders;
the basic tests pass on both my rv770 and my rv635.

It requires this kernel patch:
http://www.spinics.net/lists/dri-devel/msg52745.html

v2: address Alex's comments.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
10 years agor600g: enable GLSL 3.30 on evergreen GPUs
Dave Airlie [Wed, 29 Jan 2014 21:48:09 +0000 (21:48 +0000)]
r600g: enable GLSL 3.30 on evergreen GPUs

This throws the switch to enable GL 3.3 and GLSL 330.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
10 years agor600g: properly propagate clip dist write value
Dave Airlie [Tue, 4 Feb 2014 00:48:42 +0000 (10:48 +1000)]
r600g: properly propagate clip dist write value

This moves the value from the GS shader to the copy shader so the registers
are set up correctly.

fixes tests/spec/glsl-1.50/execution/geometry/clip-distance-out-values.shader_test

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
10 years agor600g: calculate a better value for array_size (v2)
Dave Airlie [Mon, 3 Feb 2014 05:31:26 +0000 (15:31 +1000)]
r600g: calculate a better value for array_size (v2)

attempt to calculate a better value for array size to avoid breaking apps.

v2: use 0xfff like streamout, suggested by Grigori

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
10 years agor600g: fix CAYMAN geometry shader support
Dave Airlie [Fri, 31 Jan 2014 03:35:51 +0000 (03:35 +0000)]
r600g: fix CAYMAN geometry shader support

cayman has a different end of program bit, so do that properly.

fixes hangs with geom shader tests on cayman.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
10 years agor600g: fix up shader out misc stuff for copy shader
Dave Airlie [Wed, 29 Jan 2014 00:17:15 +0000 (00:17 +0000)]
r600g: fix up shader out misc stuff for copy shader

Set the correct values so the misc out register is set up correctly
for the copy shader.

This also updates the state for the gs copy shader so the hw
gets programmed correctly.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
10 years agor600g: port the layered surface rendering patch from radeonsi
Dave Airlie [Tue, 28 Jan 2014 23:15:29 +0000 (23:15 +0000)]
r600g: port the layered surface rendering patch from radeonsi

This just makes r600 and evergreen do what the radeonsi codepaths do
for layered rendering. This makes the 2d amd_vertex_shader_layer test
pass on evergreen.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>