Jason Ekstrand [Thu, 22 Jan 2015 23:49:56 +0000 (15:49 -0800)]
i965/emit: Assert that src1 is not an MRF after doing the MRF->GRF conversion
When emitting texturing from indirect texture units, we need to be able to
scratch around in the header message. Since we only do this for >= HSW,
this is ok since there are no MRFs.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Jason Ekstrand [Thu, 22 Jan 2015 21:46:44 +0000 (13:46 -0800)]
i965/emit: Do the sampler index adjustment directly in header.0.3
Prior to this commit, the adjust_sampler_state_pointer function took an
extra register that it could use as scratch space. The usual candidate was
the destination of the sampler instruction. However, if that register ever
aliased anything important such as the sampler index, this would scratch
all over important data. Fortunately, the calculation is such that we can
just do it in place and we don't need the scratch space at all.
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Axel Davy [Wed, 7 Jan 2015 09:27:23 +0000 (10:27 +0100)]
st/nine: Correctly handle when ff vs should have no texture coord input/output
The previous semantics were:
. if the ff ps will not run a ff stage, then do not output texture coords for this stage
for the vs
. if XYZRHW is used (position_t), use only the mode where input coordinates are copied
to the outputs.
The problem arises when apps don't supply texture inputs. When an app specifies PASSTHRU, it means
"copy the texture coord input to the texture coord output if there is such an input". The case
where there is no texture coord input wasn't handled correctly.
Drivers like r300 dislike it when the vs has inputs that are not fed.
Moreover, if the app uses the ff vs with a programmable ps, we shouldn't look at
the parameters of the ff ps to decide whether or not to output texture
coordinates.
The new semantics are:
. if XYZRHW is used, restrict to PASSTHRU
. if PASSTHRU is used and no texture input is declared, then do not output
texture coords for this stage
The case where ff ps needs a texture coord input and ff vs doesn't output
it is not handled, and should probably be a runtime error.
This fixes 3Dmark05, which uses ff vs with programmable ps.
Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Axel Davy [Mon, 5 Jan 2015 15:26:27 +0000 (16:26 +0100)]
st/nine: Change comment relating to vertex shader inputs not matching declaration
Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Axel Davy [Sat, 3 Jan 2015 10:29:40 +0000 (11:29 +0100)]
st/nine: Allocate vs constbuf buffer for indirect addressing once.
When the shader does indirect addressing on the constants,
we allocate a temporary constant buffer to which we copy
the constants from the app given user constants and
the constants filled in the shader.
This patch makes this buffer be allocated once.
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Tiziano Bacocco <tizbac2@gmail.com>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
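As a rough illustration of the buffer this commit describes, the sketch below merges the app-supplied constants with the shader-defined ones into a single array that relative addressing can index. The type and function names (`shader_const`, `build_indirect_constbuf`) and the slot count are hypothetical and are not taken from nine's code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NUM_CONST_SLOTS 8   /* hypothetical; the real limit depends on the shader model */

/* A constant defined inside the shader itself (hypothetical layout). */
struct shader_const {
    unsigned slot;      /* which cN register it defines */
    float value[4];
};

/* Fill dst with the app's constants, then overlay the shader-defined ones,
 * so that an indirect read sees the same data as a direct read would.
 * Both dst and user hold 4 floats per slot. */
static void build_indirect_constbuf(float *dst, const float *user,
                                    const struct shader_const *defs,
                                    size_t num_defs)
{
    memcpy(dst, user, sizeof(float) * 4 * NUM_CONST_SLOTS);
    for (size_t i = 0; i < num_defs; i++) {
        assert(defs[i].slot < NUM_CONST_SLOTS);
        memcpy(&dst[4 * defs[i].slot], defs[i].value, sizeof(defs[i].value));
    }
}
```

Allocating `dst` once and refilling it on each update, rather than allocating it per draw, is the kind of change the commit message describes.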
Axel Davy [Fri, 2 Jan 2015 13:38:01 +0000 (14:38 +0100)]
st/nine: Allocate the correct size for the user constant buffer
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Axel Davy [Fri, 2 Jan 2015 13:22:17 +0000 (14:22 +0100)]
st/nine: Add variables containing the size of the constant buffers
Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Axel Davy [Sun, 7 Dec 2014 12:42:41 +0000 (13:42 +0100)]
st/nine: Fix sm3 relative addressing for non-debug build
Relative addressing needs the constant buffer to get all
the correct constants, even those defined by the shader.
The code to copy the shader constants to the constant buffer
was enabled only for debug build. Enable it always.
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Axel Davy [Sat, 6 Dec 2014 23:14:19 +0000 (00:14 +0100)]
st/nine: Remove unused code for ps
Since constant indirect addressing is not allowed for ps,
we can remove our code to handle that.
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Axel Davy [Sat, 6 Dec 2014 21:26:50 +0000 (22:26 +0100)]
st/nine: Correct rules for relative addressing and constants.
Relative addressing for constants is possible only for vs float
constants.
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Axel Davy [Sun, 28 Dec 2014 13:56:02 +0000 (14:56 +0100)]
st/nine: Implement TEXREG2AR, TEXREG2GB and TEXREG2RGB
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Axel Davy [Sun, 28 Dec 2014 13:46:01 +0000 (14:46 +0100)]
st/nine: Implement TEXDP3TEX
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Axel Davy [Sun, 28 Dec 2014 13:42:33 +0000 (14:42 +0100)]
st/nine: Implement TEXDP3
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Axel Davy [Sun, 28 Dec 2014 13:38:25 +0000 (14:38 +0100)]
st/nine: Implement TEXDEPTH
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Axel Davy [Sun, 28 Dec 2014 13:26:12 +0000 (14:26 +0100)]
st/nine: Implement TEXM3x3SPEC
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Axel Davy [Sun, 28 Dec 2014 13:21:15 +0000 (14:21 +0100)]
st/nine: Implement TEXM3x2TEX
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Axel Davy [Sun, 28 Dec 2014 13:18:26 +0000 (14:18 +0100)]
st/nine: implement TEXM3x2DEPTH
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Axel Davy [Sun, 28 Dec 2014 12:05:15 +0000 (13:05 +0100)]
st/nine: Fix TEXM3x3 and implement TEXM3x3VSPEC
The fix is that this line:
"src[s] = tx->regs.vT[s];" is wrong if s doesn't start from 0.
Instead access tx->regs.vT directly when needed.
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Axel Davy [Thu, 8 Jan 2015 21:21:20 +0000 (22:21 +0100)]
st/nine: Fill missing dst and src number for some instructions.
Not filling them correctly results in bad padding and a later crash.
Reviewed-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Axel Davy [Wed, 3 Dec 2014 15:28:56 +0000 (16:28 +0100)]
st/nine: Implement TEXCOORD special behaviours
texcoord for ps < 1_4 should clamp the values between 0 and 1.
texcrd (texcoord for ps 1_4) does not clamp and can be used with
two modifiers, _dw and _dz, which mean the channels are divided
by w or z.
Implement those in shared code, since the same modifiers can be used
for texld ps 1_4.
v2: replace DIV by RCP + MUL
v3: Remove a useless MOV
Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
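The two behaviours described in the TEXCOORD commit above can be sketched in plain C. This is only an illustration: the function names are hypothetical, and the sketch assumes x and y are the channels of interest for the _dw case.

```c
#include <assert.h>

/* Clamp one channel to [0, 1], as texcoord does for ps < 1_4. */
static float clamp01(float v)
{
    return v < 0.0f ? 0.0f : (v > 1.0f ? 1.0f : v);
}

/* Sketch of the _dw modifier: scale x and y by 1/w.
 * One reciprocal followed by multiplies mirrors the "RCP + MUL" note in v2. */
static void apply_dw(const float in[4], float out[2])
{
    assert(in[3] != 0.0f);          /* this sketch does not model w == 0 */
    float rcp = 1.0f / in[3];
    out[0] = in[0] * rcp;
    out[1] = in[1] * rcp;
}
```

A _dz variant would look the same with `in[2]` as the divisor.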
Axel Davy [Fri, 26 Dec 2014 10:14:05 +0000 (11:14 +0100)]
st/nine: Fix CALLNZ implementation
Nothing seems to indicate that the negation modifier would be stored in the
instruction flags instead of the source modifier. tx_src_param has
already handled it if it is in the source modifier.
In addition, when the card supports native integers, booleans
are stored as 32-bit ints and are equal to 0 or 0xFFFFFFFF.
Given that 0xFFFFFFFF is NaN when interpreted as a float, it is better to use
UIF than IF.
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
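The NaN remark in the CALLNZ commit above can be checked with a few lines of C. This is a minimal sketch with hypothetical helper names; the integer-style test only stands in for the idea behind UIF and is not the TGSI implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as an IEEE-754 float without changing the bits. */
static float bits_to_float(uint32_t bits)
{
    float f;
    assert(sizeof(f) == sizeof(bits));
    memcpy(&f, &bits, sizeof(f));
    return f;
}

/* A NaN is the only float value that compares unequal to itself. */
static int is_nan(float f)
{
    return f != f;
}

/* Integer-style test: any non-zero bit pattern counts as true. */
static int bool_is_true(uint32_t bits)
{
    return bits != 0u;
}
```

Testing the raw bits avoids depending on how a float comparison treats NaN.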
Axel Davy [Thu, 25 Dec 2014 15:50:09 +0000 (16:50 +0100)]
st/nine: Fix some fixed function pipeline operation
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Axel Davy [Wed, 24 Dec 2014 08:58:49 +0000 (09:58 +0100)]
st/nine: Clamp ps 1.X constants
This is Wine (and Windows) behaviour.
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Axel Davy [Wed, 3 Dec 2014 14:58:34 +0000 (15:58 +0100)]
st/nine: Remove duplicated code for ps texcoord input declaration
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Axel Davy [Thu, 25 Dec 2014 10:37:28 +0000 (11:37 +0100)]
st/nine: Fix CND implementation
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Tiziano Bacocco <tizbac2@gmail.com>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Axel Davy [Fri, 2 Jan 2015 13:57:00 +0000 (14:57 +0100)]
st/nine: Match REP implementation to LOOP
The previous implementation behaved fine, but this improves it by:
. Improving the documentation
. Decreasing the counter (comparing to 0 is likely to be faster than comparing to a constant)
. Moving the counter update to the end, for better performance in shaders that
break out of the loop before the count is exhausted.
Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
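The loop shape described in the REP commit above might look roughly like the following C. This is a sketch of the control flow only, under the assumption that the counter is tested at the top and updated at the bottom; the function and its `break_after` parameter are hypothetical and do not reflect the generated TGSI.

```c
#include <assert.h>

/* Run a body up to `count` times with a counter that decreases toward zero.
 * `break_after` models a shader that breaks out early (negative: never).
 * Returns how many times the body ran. */
static int rep_sketch(int count, int break_after)
{
    assert(count >= 0);
    int iterations = 0;
    int counter = count;

    for (;;) {
        if (counter == 0)          /* compare against 0 rather than a constant */
            break;
        iterations++;              /* loop body */
        if (break_after >= 0 && iterations == break_after)
            break;                 /* early exit skips the counter update below */
        counter--;                 /* counter update sits at the end */
    }
    return iterations;
}
```

With the update at the end, an iteration that breaks early never pays for the decrement.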
Axel Davy [Mon, 8 Dec 2014 14:38:28 +0000 (15:38 +0100)]
st/nine: Rewrite LOOP implementation, and a0 aL handling
Previous implementation didn't work well with nested loops.
Instead of using several address registers, put a0 and aL
into normal registers, and copy them to one address register when
we need to use them.
Wine tests loop_index_test() and nested_loop_test() now pass correctly.
Fixes r600g crash while loading Bioshock -
bug https://bugs.freedesktop.org/show_bug.cgi?id=85696
Tested-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Axel Davy [Wed, 3 Dec 2014 14:50:53 +0000 (15:50 +0100)]
st/nine: Correct LOG on negative values
We should take the absolute value of the input.
Also return -FLT_MAX instead of -Inf for an input of 0.
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Axel Davy [Wed, 3 Dec 2014 14:31:44 +0000 (15:31 +0100)]
st/nine: Handle NRM with input of null norm
When the input's xyz are 0.0, the output
should be 0.0. This is due to the fact that
Inf * 0 = 0 for dx9. To handle this case,
cap the result of RSQ to FLT_MAX. We have
FLT_MAX * 0 = 0.
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
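The reasoning in the NRM commit above can be illustrated in C. Under IEEE-754 rules Inf * 0 is NaN, so capping the RSQ result at FLT_MAX is what makes the product come out as 0. The helper names are hypothetical; this is a sketch of the arithmetic, not of the shader translation.

```c
#include <assert.h>
#include <float.h>

/* Cap a reciprocal-square-root result so that it is never +Inf. */
static float cap_rsq(float rsq)
{
    assert(rsq >= 0.0f);
    return rsq > FLT_MAX ? FLT_MAX : rsq;
}

/* Scale one component of the input vector by the capped RSQ result. */
static float nrm_component(float x, float rsq)
{
    return x * cap_rsq(rsq);
}
```

For a null input vector the uncapped RSQ would be +Inf, and every component is 0, so each output component becomes FLT_MAX * 0 = 0.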
Axel Davy [Wed, 3 Dec 2014 14:28:42 +0000 (15:28 +0100)]
st/nine: Handle RSQ special cases
We should use the absolute value of the input as input to ureg_RSQ.
Moreover, an input of 0.0 should return FLT_MAX.
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Axel Davy [Wed, 3 Dec 2014 14:14:07 +0000 (15:14 +0100)]
st/nine: Fix POW implementation
POW doesn't map directly to TGSI, since we should
take the absolute value of src0.
Fixes black textures in some games
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Axel Davy [Fri, 26 Dec 2014 10:02:08 +0000 (11:02 +0100)]
st/nine: Fix typo for M4x4
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Axel Davy [Fri, 26 Dec 2014 08:22:26 +0000 (09:22 +0100)]
st/nine: Correctly declare NineTranslateInstruction_Mkxn inputs
Let's say we have c1 and c2 declared in the shader and c0 given by the app.
Then here we would have read c0, c1 and c2 as given by the app, instead
of the correct c0 (from the app) and c1, c2 (from the shader).
This correction fixes several issues in some games.
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Axel Davy [Wed, 3 Dec 2014 13:47:24 +0000 (14:47 +0100)]
st/nine: Saturate oFog and oPts vs outputs
According to docs and Wine, these two vs outputs have
to be saturated.
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Axel Davy [Mon, 22 Dec 2014 17:44:06 +0000 (18:44 +0100)]
st/nine: Remove some shader unused code
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Axel Davy [Fri, 2 Jan 2015 12:42:11 +0000 (13:42 +0100)]
st/nine: Convert integer constants to floats before storing them when cards don't support integers
The shader code already behaves as if they are floats when the card doesn't support integers.
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Axel Davy [Fri, 2 Jan 2015 12:00:06 +0000 (13:00 +0100)]
st/nine: Rework of boolean constants
Convert them to shader booleans at an earlier stage.
The previous code is fine, but a later patch will make
integers be converted at an earlier stage, so do
the same for booleans.
Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Axel Davy [Sun, 7 Dec 2014 17:11:40 +0000 (18:11 +0100)]
st/nine: Add ATI1 and ATI2 support
Adds ATI1 and ATI2 support to nine.
They map to PIPE_FORMAT_RGTC1_UNORM and PIPE_FORMAT_RGTC2_UNORM,
but need special handling.
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Xavier Bouchoux <xavierb@gmail.com>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Axel Davy [Wed, 3 Dec 2014 22:33:07 +0000 (23:33 +0100)]
st/nine: Check if srgb format is supported before trying to use it.
According to MSDN, we must act as if the user didn't ask for sRGB if we don't
support it.
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Stanislaw Halik [Thu, 4 Dec 2014 15:52:22 +0000 (16:52 +0100)]
st/nine: Hack to generate resource if it doesn't exist when getting view
Buffers in the MANAGED pool are supposed to keep their content in a RAM buffer,
with a copy in VRAM if there is enough memory (the driver manages memory and decides when
to delete the buffer in VRAM).
This is not implemented properly in nine: a VRAM copy is created
when the RAM memory is filled, and the VRAM copy is kept in sync with the RAM
memory updates.
Due to some issues (in the implementation or in app logic), we can end up
trying to create a sampler view of the resource before the
VRAM resource has been created. This hack creates the resource when we hit this case, which prevents
crashing, but doesn't help with the resource content.
This fixes several games crashing at launch.
Acked-by: Axel Davy <axel.davy@ens.fr>
Acked-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Stanislaw Halik <sthalik@misaki.pl>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Axel Davy [Tue, 2 Dec 2014 22:33:37 +0000 (23:33 +0100)]
st/nine: NineBaseTexture9: update sampler view creation
While the previous code had the correct behaviour in general,
this new code is more readable (without checking all gallium formats
manually) and has better-defined behaviour for depth stencil resources.
Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Axel Davy [Sun, 7 Dec 2014 15:51:49 +0000 (16:51 +0100)]
st/nine: Return D3DERR_INVALIDCALL when trying to create a texture of bad format
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Axel Davy [Tue, 2 Dec 2014 23:07:26 +0000 (00:07 +0100)]
st/nine: Fix crash when deleting non-implicit swapchain
The implicit swapchains are destroyed when the device instance is
destroyed. However, this is not the case for non-implicit swapchains,
and the application may have kept a reference on the swapchain
buffers to reuse them.
Fixes problems with battle.net launcher.
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Tested-by: Nick Sarnie <commendsarnex@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Axel Davy [Tue, 2 Dec 2014 21:44:37 +0000 (22:44 +0100)]
st/nine: CubeTexture: fix GetLevelDesc
This->surfaces contains the surfaces associated to the levels
and faces. This->surfaces[6*Level] is what we want here,
since it gives us a face descriptor for the level 'Level'.
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Xavier Bouchoux <xavierb@gmail.com>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
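The indexing in the GetLevelDesc commit above implies a level-major layout for the surface array. A minimal sketch, assuming that layout (the helper name is hypothetical):

```c
#include <assert.h>

#define CUBE_FACES 6

/* Index into a flat array that stores the six faces of level 0 first,
 * then the six faces of level 1, and so on (layout assumed from the
 * commit text). */
static unsigned cube_surface_index(unsigned level, unsigned face)
{
    assert(face < CUBE_FACES);
    return CUBE_FACES * level + face;
}
```

With face 0 this reduces to `6 * level`, which matches the `This->surfaces[6*Level]` expression in the commit message.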
Axel Davy [Tue, 2 Dec 2014 21:18:30 +0000 (22:18 +0100)]
st/nine: NineBaseTexture9: fix setting of last_layer
Use settings similar to u_sampler_view_default_template
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Axel Davy [Thu, 25 Dec 2014 10:04:10 +0000 (11:04 +0100)]
st/nine: Correctly advertise D3DPMISCCAPS_CLIPTLVERTS
The cap means D3DFVF_XYZRHW vertices will see clipping.
This is not the case when
PIPE_CAP_TGSI_VS_WINDOW_SPACE_POSITION is supported, since
it'll disable clipping.
Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Xavier Bouchoux [Wed, 17 Dec 2014 22:10:04 +0000 (23:10 +0100)]
st/nine: Fix D3DRS_POINTSPRITE support
It's done by testing the existence of the point sprite output register *after* parsing the vertex shader.
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Xavier Bouchoux <xavierb@gmail.com>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Axel Davy [Sun, 7 Dec 2014 15:46:28 +0000 (16:46 +0100)]
st/nine: Add new texture format strings
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Xavier Bouchoux [Mon, 8 Dec 2014 22:28:28 +0000 (23:28 +0100)]
st/nine: Add missing c++ declaration for IDirect3DVolumeTexture9
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Xavier Bouchoux <xavierb@gmail.com>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Xavier Bouchoux [Mon, 8 Dec 2014 22:31:13 +0000 (23:31 +0100)]
st/nine: Additional defines to d3dtypes.h
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Xavier Bouchoux <xavierb@gmail.com>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Axel Davy [Sun, 21 Dec 2014 12:03:47 +0000 (13:03 +0100)]
st/nine: Fix clip state logic
The clip state was reset every time, incurring an overhead.
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
David Heidelberger [Sun, 28 Dec 2014 01:44:53 +0000 (02:44 +0100)]
st/nine: query: remove unused variable (trivial)
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: David Heidelberg <david@ixit.cz>
Eric Anholt [Wed, 21 Jan 2015 23:32:48 +0000 (15:32 -0800)]
nir: Fix setup of constant bool initializers.
brw_fs_nir has only seen scalar bools so far, thanks to vector splitting,
and the ralloc in glsl_to_nir.cpp will *usually* get you a 0-filled
chunk of memory, so reading too large of a value will usually get you the
right bool value. But once we start doing vector bools in a few commits,
we end up getting bad values.
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Eric Anholt [Wed, 21 Jan 2015 00:23:51 +0000 (16:23 -0800)]
nir: Make an easier helper for setting up SSA defs.
Almost all instructions we nir_ssa_def_init() for are nir_dests, and you
have to remember to set is_ssa when you do. Just provide the
simpler helper, instead.
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Jonathan Gray [Fri, 9 Jan 2015 01:33:17 +0000 (12:33 +1100)]
glsl: Link glsl_test with pthreads library.
Otherwise pthread_mutex_lock will be an undefined reference
on OpenBSD.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88219
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
Vinson Lee [Mon, 19 Jan 2015 20:54:44 +0000 (12:54 -0800)]
scons: Add X11 include path if X11 is available.
Mac OS X XQuartz places X11 headers at /opt/X11/include.
This patch fixes this Mac OS X SCons build error.
Compiling src/gallium/state_trackers/glx/xlib/glx_api.c ...
In file included from src/gallium/state_trackers/glx/xlib/glx_api.c:34:
include/GL/glx.h:30:10: fatal error: 'X11/Xlib.h' file not found
^
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
José Fonseca [Thu, 22 Jan 2015 20:05:48 +0000 (20:05 +0000)]
meta: Move loop declaration to top of block.
Fixes MSVC build.
Trivial.
Jason Ekstrand [Tue, 13 Jan 2015 00:22:30 +0000 (16:22 -0800)]
i965/tex_subimage: use meta instead of the blitter for PBO TexSubImage
Reviewed-by: Neil Roberts <neil@linux.intel.com>
Jason Ekstrand [Tue, 13 Jan 2015 00:21:17 +0000 (16:21 -0800)]
i965/tex_image: Use meta instead of the blitter for PBO TexImage and GetTexImage
Reviewed-by: Neil Roberts <neil@linux.intel.com>
Jason Ekstrand [Tue, 13 Jan 2015 00:20:27 +0000 (16:20 -0800)]
i965/pixel_read: Use meta_pbo_GetTexSubImage for PBO ReadPixels
Since the meta path can do strictly more than the blitter path, we just
remove the blitter path entirely.
Reviewed-by: Neil Roberts <neil@linux.intel.com>
Jason Ekstrand [Mon, 12 Jan 2015 23:39:59 +0000 (15:39 -0800)]
meta: Add an implementation of GetTexSubImage for PBOs
Reviewed-by: Neil Roberts <neil@linux.intel.com>
Jason Ekstrand [Tue, 6 Jan 2015 02:17:04 +0000 (18:17 -0800)]
meta: Add a BlitFramebuffers-based implementation of TexSubImage
This meta path, designed for use with PBO's, creates a temporary texture
out of the PBO and uses BlitFramebuffers to do the actual texture upload.
v2 Jason Ekstrand <jason.ekstrand@intel.com>:
- Add support for handling simple packing options
v3 Jason Ekstrand <jason.ekstrand@intel.com>:
- Refactor to split out the texture-from-pbo code
- Rename to _mesa_meta_pbo_TexSubImage
Reviewed-by: Neil Roberts <neil@linux.intel.com>
Jason Ekstrand [Mon, 12 Jan 2015 22:43:34 +0000 (14:43 -0800)]
formats: Use a hash table for _mesa_format_from_array_format
Going through the for loop every time has noticeable overhead. This fixes
things up so we only do that once ever and then just do a hash table lookup
which should be much cheaper.
v2 Jason Ekstrand <jason.ekstrand@intel.com>:
- Use once_flag and call_once from c11/threads.h instead of pthreads
Reviewed-by: Neil Roberts <neil@linux.intel.com>
Jason Ekstrand [Thu, 8 Jan 2015 05:13:49 +0000 (21:13 -0800)]
i965: Implement SetTextureStorageForBufferObject
Reviewed-by: Neil Roberts <neil@linux.intel.com>
Jason Ekstrand [Tue, 13 Jan 2015 17:50:37 +0000 (09:50 -0800)]
i965: Apply the miptree offset to surface state for renderbuffers
Previously, we were completely ignoring the mt->offset field for
renderbuffers. While it does have some alignment constraints, it is valid
to use it. This patch adds the code to each of the 4 surface state setup
functions to handle it.
Reviewed-by: Neil Roberts <neil@linux.intel.com>
Jason Ekstrand [Thu, 8 Jan 2015 20:23:46 +0000 (12:23 -0800)]
i965/mipmap_tree: Add a depth parameter to create_for_bo
Reviewed-by: Neil Roberts <neil@linux.intel.com>
Jason Ekstrand [Thu, 8 Jan 2015 05:13:15 +0000 (21:13 -0800)]
mesa/dd: Add a function for creating a texture from a buffer object
Reviewed-by: Neil Roberts <neil@linux.intel.com>
Tapani Pälli [Mon, 19 Jan 2015 10:28:17 +0000 (12:28 +0200)]
glsl: do not allow interface block to have name already taken
Fixes currently failing Piglit case
interface-blocks-name-reused-globally.vert
v2: combine var declaration with assignment (Ian)
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Matt Turner [Thu, 22 Jan 2015 04:22:18 +0000 (20:22 -0800)]
nir: Replace assert(0) with unreachable().
Fixes a couple of warnings in the process.
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Matt Turner [Thu, 22 Jan 2015 04:16:38 +0000 (20:16 -0800)]
i965/vec4: Fix fprintf argument ordering.
Introduced in commit 3167a80b.
Jason Ekstrand [Wed, 21 Jan 2015 19:11:03 +0000 (11:11 -0800)]
nir: Stop using designated initializers
Designated initializers with anonymous unions don't work in MSVC or
GCC < 4.6. With a couple of constructor methods, we don't need them any
more and the code is actually cleaner.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88467
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
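To illustrate the approach in the designated-initializers commit above, here is a sketch of constructor functions for a struct that contains an anonymous union. All type and function names are hypothetical stand-ins, not NIR's definitions; the point is only that plain member assignment avoids initializing through the anonymous union.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

struct example_ssa_def { int index; };   /* hypothetical stand-in */

/* A source that is either an SSA value or a register, stored in an
 * anonymous union (hypothetical type, loosely modelled on the commit). */
struct example_src {
    union {
        struct example_ssa_def *ssa;
        struct { unsigned reg_index; unsigned base_offset; } reg;
    };
    bool is_ssa;
};

/* Constructor: zero everything, then assign members one by one.
 * No designated initializer reaches into the anonymous union. */
static struct example_src example_src_for_ssa(struct example_ssa_def *def)
{
    struct example_src src;
    assert(def != NULL);
    memset(&src, 0, sizeof(src));
    src.is_ssa = true;
    src.ssa = def;
    return src;
}

static struct example_src example_src_for_reg(unsigned reg_index)
{
    struct example_src src;
    memset(&src, 0, sizeof(src));
    src.is_ssa = false;
    src.reg.reg_index = reg_index;
    return src;
}
```

Call sites then read `example_src_for_ssa(&def)` instead of spelling out a brace-enclosed initializer.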
Tobias Klausmann [Mon, 19 Jan 2015 20:51:38 +0000 (21:51 +0100)]
mesa: change assert to unreachable in two format functions
This fixes two problems reported by osc:
I: Program returns random data in a function
E: Mesa no-return-in-nonvoid-function ../../src/mesa/main/format_utils.c:180
E: Mesa no-return-in-nonvoid-function ../../src/mesa/main/glformats.c:2714
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
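The warning fixed by the commit above comes from non-void functions whose last statement is an assert. A sketch of the pattern, assuming a GCC or Clang compiler for `__builtin_unreachable()`; the macro, enum and function names are hypothetical and only mimic the shape of the change.

```c
#include <assert.h>

/* Hypothetical macro in the spirit of unreachable(): assert in debug builds,
 * and tell the compiler the path cannot be taken (GCC/Clang builtin). */
#define EXAMPLE_UNREACHABLE(msg) \
    do {                         \
        assert(!msg);            \
        __builtin_unreachable(); \
    } while (0)

enum example_type { EXAMPLE_BYTE, EXAMPLE_SHORT, EXAMPLE_INT };

/* Every path either returns or is marked unreachable, so the compiler has no
 * reason to warn about falling off the end of a non-void function. */
static int example_bytes_per_type(enum example_type type)
{
    switch (type) {
    case EXAMPLE_BYTE:  return 1;
    case EXAMPLE_SHORT: return 2;
    case EXAMPLE_INT:   return 4;
    }
    EXAMPLE_UNREACHABLE("invalid example_type");
}
```

A bare `assert(0)` compiles to nothing in release builds, which is what leaves the function without a return on that path.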
Jason Ekstrand [Wed, 21 Jan 2015 19:10:11 +0000 (11:10 -0800)]
nir: Add src and dest constructors
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Jan Vesely [Wed, 14 Jan 2015 21:12:06 +0000 (16:12 -0500)]
mesa: Add assert to check number of vector elements
The code below crashes when vector_elements <= 0
Fixes Warray-bounds warnings
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Jan Vesely [Thu, 15 Jan 2015 18:41:04 +0000 (13:41 -0500)]
mesa: Fix some signed-unsigned comparison warnings
v2: s/unsigned int/unsigned/ in prog_optimize.c
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Jan Vesely [Wed, 14 Jan 2015 20:53:18 +0000 (15:53 -0500)]
mesa: remove comparisons that are always true
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Jason Ekstrand [Wed, 21 Jan 2015 00:30:14 +0000 (16:30 -0800)]
nir: Add a nir_foreach_phi_src helper macro
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Ben Widawsky [Tue, 23 Dec 2014 03:29:22 +0000 (19:29 -0800)]
i965: Extract scalar region checking logic
There are currently 2 users of this functionality. I have 2 more users coming
up, and having a simple function makes the results much cleaner. The existing
interface semantics was proposed by Matt.
v2 (Ken): Rename to region_matches()/has_scalar_region().
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Ben Widawsky [Tue, 23 Dec 2014 03:29:13 +0000 (19:29 -0800)]
i965: Add QWORD sizes to type_sz macro
GEN8 added the QWORD as a valid type for certain operations on the EU.
In order to calculate the number of registers used one must have the type
size as part of the equation. Quoting the formula in the code:
regs_written = (dst.width * dst.stride * type_sz(dst.type) + 31) / 32;
Adding this separately for bisection since there is no simple way to add
an assert in the type_sz function.
NOTE: I was confused for a while because it's impossible
to calculate the region, i.e. the registers needed, without vstride. However,
at this point these are all part of the IR, so there is no vstride.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
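The regs_written formula quoted in the QWORD commit above is easy to work through by hand. The sketch below wraps it in a hypothetical helper; the divisor suggests 32-byte registers, and adding 31 before dividing rounds the byte count up to whole registers.

```c
#include <assert.h>

/* Number of registers written, using the formula quoted in the commit:
 *   (width * stride * type_size + 31) / 32
 * i.e. the byte count rounded up to whole 32-byte registers. */
static unsigned regs_written(unsigned width, unsigned stride, unsigned type_size)
{
    assert(type_size == 1 || type_size == 2 || type_size == 4 || type_size == 8);
    return (width * stride * type_size + 31) / 32;
}
```

For example, 8 channels of a 4-byte type give (32 + 31) / 32 = 1 register, while 8 channels of an 8-byte QWORD give (64 + 31) / 32 = 2, which is why the type size has to be correct for QWORD types.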
Eric Anholt [Tue, 20 Jan 2015 22:19:29 +0000 (14:19 -0800)]
Rob Clark [Sun, 18 Jan 2015 19:47:15 +0000 (14:47 -0500)]
freedreno/a4xx: sysmem bypass
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Sun, 18 Jan 2015 23:10:02 +0000 (18:10 -0500)]
freedreno: update generated headers
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Tom Stellard [Wed, 7 Jan 2015 20:51:48 +0000 (15:51 -0500)]
radeonsi: Re-enable LLVM IR dumps
This was inadvertently disabled by 761e36b4caab4e8e09a4c2b1409a825902fc7d2c.
Tom Stellard [Wed, 10 Dec 2014 01:05:44 +0000 (20:05 -0500)]
radeonsi/compute: Use relocs for scratch pointer rather than user sgprs v2
Instead of passing a pointer to the scratch buffer via user sgprs, we
now patch the shader with the buffer address using reloc information
from the LLVM generated ELF.
v2:
- Make sure not to break older LLVM.
Tom Stellard [Wed, 10 Dec 2014 01:03:50 +0000 (20:03 -0500)]
radeon: Teach radeon_elf_read() how to parse reloc information v3
v2:
- Use strdup for copying reloc names.
- Free reloc memory.
v3:
- Add free_relocs parameter to radeon_shader_binary_free_members()
Tom Stellard [Wed, 14 Jan 2015 15:01:29 +0000 (10:01 -0500)]
radeon: Add a helper function for freeing members of radeon_shader_binary
Kenneth Graunke [Sun, 18 Jan 2015 07:21:15 +0000 (23:21 -0800)]
i965: Work around mysterious Gen4 GPU hangs with minimal state changes.
Gen4 hardware appears to GPU hang frequently when using Chromium, and
also when running 'glmark2 -b ideas'. Most of the error states contain
3DPRIMITIVE commands in quick succession, with very few state packets
between them - usually VERTEX_BUFFERS/ELEMENTS and CONSTANT_BUFFER.
I trimmed an apitrace of the glmark2 hang down to two draw calls with a
glUniformMatrix4fv call between the two. Either draw by itself works
fine, but together, they hang the GPU. Removing the glUniform call
makes the hangs disappear. In the hardware state, this translates to
removing the CONSTANT_BUFFER packet between the two 3DPRIMITIVE packets.
Flushing before emitting CONSTANT_BUFFER packets also appears to make
the hangs disappear. I observed a slowdown in glxgears by doing it all
the time, so I've chosen to only do it when BRW_NEW_BATCH and
BRW_NEW_PSP are unset (i.e. we haven't done a CS_URB_STATE change or
already flushed the whole pipeline).
I'd much rather understand the problem, but at this point, I don't see
how we'd ever be able to track it down further. We have no real tools,
and the hardware people moved on years ago. I've analyzed 20+ error
states and read every scrap of documentation I could find.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80568
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85367
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Matt Turner <mattst88@gmail.com>
Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
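The flush condition described above can be sketched as a dirty-bit test. This is only an illustration: the bit values assigned to BRW_NEW_BATCH and BRW_NEW_PSP below are placeholders, and needs_flush_before_constant_buffer() is a hypothetical helper name, not the function the driver uses.

```cpp
#include <cassert>
#include <cstdint>

// Placeholder bit positions; the real driver defines its own values.
constexpr uint64_t BRW_NEW_BATCH = 1ull << 0;
constexpr uint64_t BRW_NEW_PSP   = 1ull << 1;

// Hypothetical helper: flush only when neither flag is set, i.e. when
// nothing else has already flushed the pipeline for this state upload.
constexpr bool needs_flush_before_constant_buffer(uint64_t dirty)
{
    return (dirty & (BRW_NEW_BATCH | BRW_NEW_PSP)) == 0;
}

void flush_condition_example()
{
    assert(needs_flush_before_constant_buffer(0));
    assert(!needs_flush_before_constant_buffer(BRW_NEW_BATCH));
    assert(!needs_flush_before_constant_buffer(BRW_NEW_PSP));
}
```

Restricting the flush to this case is what avoids the glxgears slowdown mentioned above, since the extra flush is skipped whenever a batch or PSP change is already pending.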
Kenneth Graunke [Fri, 16 Jan 2015 09:40:33 +0000 (01:40 -0800)]
i965/nir: Enable SIMD16 support in the NIR FS backend.
With the previous commits in place, it just works.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Kenneth Graunke [Fri, 16 Jan 2015 21:16:18 +0000 (13:16 -0800)]
i965/nir: Use offset() instead of altering reg_offset directly.
offset() properly handles reg_width, so it'll work for SIMD16.
While we're in the area, simplify a few cases, and use retype() to cut a
few more lines of code.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Kenneth Graunke [Fri, 16 Jan 2015 10:12:17 +0000 (02:12 -0800)]
i965/nir: Replace fs_reg(GRF, virtual_grf_alloc(...)) with vgrf(...).
brw_fs_nir.cpp creates almost all of its registers via:
fs_reg reg = fs_reg(GRF, virtual_grf_alloc(num_components));
When we add SIMD16 support, we'll need to set reg->width = 16 and
double the VGRF size...on pretty much every VGRF it allocates.
This patch replaces that pattern with a new "vgrf" helper method:
fs_reg reg = vgrf(num_components);
The new function correctly takes reg_width into account. For now,
reg_width is always 1, so this should have no functional change.
v2: Just make vgrf() account for reg_width right away, rather than
changing the behavior in the next patch.
v3: Replace one last virtual_grf_alloc I missed. It's used in code
that only runs for dispatch_width == 8, so it doesn't matter,
but consistency is nice.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
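A rough model of the vgrf() idea, for illustration only. toy_visitor is a hypothetical stand-in, not the real fs_visitor, and it returns a plain register number instead of an fs_reg. It assumes reg_width is 1 for SIMD8 and 2 for SIMD16.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified stand-in for the compiler's VGRF bookkeeping.
struct toy_visitor {
    int reg_width = 1;            // assumed: 1 for SIMD8, 2 for SIMD16
    std::vector<int> vgrf_sizes;  // allocated size of each virtual GRF

    // Allocate a virtual GRF of the given size and return its number.
    int virtual_grf_alloc(int size)
    {
        vgrf_sizes.push_back(size);
        return static_cast<int>(vgrf_sizes.size()) - 1;
    }

    // Helper that scales the allocation by reg_width, so callers never
    // have to remember to double the size themselves.
    int vgrf(int num_components)
    {
        return virtual_grf_alloc(num_components * reg_width);
    }
};

void vgrf_example()
{
    toy_visitor simd8;
    assert(simd8.vgrf_sizes[simd8.vgrf(4)] == 4);

    toy_visitor simd16;
    simd16.reg_width = 2;
    assert(simd16.vgrf_sizes[simd16.vgrf(4)] == 8);
}
```

Centralizing the scaling in one helper means a later change of reg_width takes effect at every allocation site without touching the callers.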
Kenneth Graunke [Fri, 16 May 2014 09:21:51 +0000 (02:21 -0700)]
i965: Replace fs_reg(fs_visitor, type) with fs_visitor::vgrf(type).
I dislike how fs_reg has a constructor that knows about fs_visitor.
Apart from that, it stands alone, with no need to interact with the
rest of the compiler. Which is sensible - a class that represents
a register should do just that. Allocating virtual register numbers
should be left up to the compiler (fs_visitor).
This patch replaces the constructor with a new fs_visitor::vgrf method,
eliminating fs_reg's dependency on fs_visitor. It ends up being no
more code.
v2: Rebase from May 2014 -> January 2015.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Marek Olšák [Sun, 11 Jan 2015 17:38:03 +0000 (18:38 +0100)]
st/mesa: don't set vs.key.clamp_color if a shader doesn't write any colors
And update some comments.
Marek Olšák [Mon, 12 Jan 2015 13:35:24 +0000 (14:35 +0100)]
winsys/radeon: increase the size of buffer cache
This should fix this performance regression:
https://bugs.freedesktop.org/show_bug.cgi?id=88227
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Carl Worth [Mon, 19 Jan 2015 18:49:41 +0000 (10:49 -0800)]
Rename sha1.c and sha1.h to mesa-sha1.c and mesa-sha1.h
The filename of sha1.h was conflicting with the system-provided
sha1.h (and in some configurations, our sha1.c was unsuccessfully
attempting to include "sha1.h" and <sha1.h> as two different files).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88523
Martin Peres [Mon, 19 Jan 2015 08:52:05 +0000 (10:52 +0200)]
mesa: fix a trivial spelling mistake
Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tapani Pälli [Fri, 16 Jan 2015 10:48:43 +0000 (12:48 +0200)]
mesa: support GL_RGB for GL_EXT_texture_type_2_10_10_10_REV
Commit 8ec6534 changed the texture upload path and the way the texture
format is checked. This commit adds support for GL_RGB with
GL_UNSIGNED_INT_2_10_10_10_REV, as specified by the
EXT_texture_type_2_10_10_10_REV extension.
This fixes a regression in the ES3 conformance test
ES3-CTS.gtf.GL3Tests.packed_pixels.packed_pixels
v2: add MESA_FORMAT_R10G10B10X2_UNORM format (Iago Toral)
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88385
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Micah Fedke [Wed, 31 Dec 2014 20:16:52 +0000 (14:16 -0600)]
mesa: Add ARB_shader_precision infrastructure
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Kenneth Graunke [Sat, 17 Jan 2015 09:01:35 +0000 (01:01 -0800)]
i965/fs: Fix the dummy fragment shader.
We hit an assertion that the destination of the FB write should not be
an immediate. (I don't know what we were thinking.) Use ARF null.
Trying to substitute real shaders with the dummy shader would crash
when trying to upload non-existent uniforms. Say there are none.
It also wouldn't generate any code because we didn't compute the CFG,
and code generation now requires it. Compute it.
Gen4-5 also require a message header to be present.
On Gen6+, there were assertion failures in SF/SBE state because
urb_setup was memset to 0 instead of -1, causing it to think there were
attributes when nothing was set up right. Set to no attributes.
Finally, you have to ensure "Setup URB Entry Read Length" is non-zero
or you get GPU hangs, at least on Crestline.
It now works on at least Crestline and Haswell.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
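The urb_setup point can be illustrated with a small sketch. It assumes, as the message implies, that a negative entry means "no attribute in this slot"; the array length and the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical slot count, chosen only for the example.
constexpr int NUM_SLOTS = 8;

// Assumed convention: an entry >= 0 marks a slot that carries an attribute.
bool has_attributes(const int *urb_setup, int n)
{
    for (int i = 0; i < n; i++) {
        if (urb_setup[i] >= 0)
            return true;
    }
    return false;
}

// memset() with -1 writes 0xFF into every byte, which reads back as -1
// in each int on two's complement targets.
void init_no_attributes(int *urb_setup, int n)
{
    std::memset(urb_setup, -1, n * sizeof(int));
}

void urb_setup_example()
{
    int urb_setup[NUM_SLOTS];

    std::memset(urb_setup, 0, sizeof(urb_setup));
    assert(has_attributes(urb_setup, NUM_SLOTS));   // zeros look like real slots

    init_no_attributes(urb_setup, NUM_SLOTS);
    assert(!has_attributes(urb_setup, NUM_SLOTS));  // -1 means nothing set up
}
```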
Kristian Høgsberg [Sat, 17 Jan 2015 05:54:54 +0000 (21:54 -0800)]
gbm: Define _DEFAULT_SOURCE to avoid warning
glibc 2.19 introduced _DEFAULT_SOURCE as a replacement for _BSD_SOURCE,
and deprecated _BSD_SOURCE with an annoying warning. Defining both is
how you're supposed to transition so let's do that. It gets rid of the
warning and we can figure out when/if we can drop _BSD_SOURCE later.
Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
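A minimal sketch of the transition pattern, assuming a glibc system. The macros must be defined before the first system header is included; strsep() is used here only as an example of a BSD-derived function, and count_fields() is a hypothetical helper, not code from gbm.

```cpp
// Define both macros so old and new glibc versions expose the BSD API.
// The guards avoid redefinition warnings if a build flag already set them.
#ifndef _BSD_SOURCE
#define _BSD_SOURCE
#endif
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE
#endif

#include <cassert>
#include <string.h>

// Count delimiter-separated fields using strsep(). The input buffer is
// modified in place, so it must be writable.
int count_fields(char *s, const char *delim)
{
    int n = 0;
    while (strsep(&s, delim) != nullptr)
        n++;
    return n;
}

void count_fields_example()
{
    char buf[] = "a:b:c";
    assert(count_fields(buf, ":") == 3);
}
```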
Vinson Lee [Sat, 17 Jan 2015 00:21:41 +0000 (16:21 -0800)]
sha1: Fix gcry_md_hd_t typo.
Fix build error.
CC libmesautil_la-sha1.lo
sha1.c: In function '_mesa_sha1_final':
sha1.c:210:22: error: 'grcy_md_hd_t' undeclared (first use in this function)
gcry_md_hd_t h = (grcy_md_hd_t) ctx;
^
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88519
Signed-off-by: Vinson Lee <vlee@freedesktop.org>