Keith Whitwell [Mon, 4 Oct 2010 14:00:34 +0000 (15:00 +0100)]
gallivm: special case conversion 4x4f to 1x16ub
Nice reduction in the number of operations required for final color
output in many shaders.
Keith Whitwell [Fri, 8 Oct 2010 16:06:05 +0000 (17:06 +0100)]
llvmpipe: avoid overflow in triangle culling
Avoid multiplying fixed-point values. Calculate triangle area in
floating point and use that for culling.
Lift area calculations up a level as we are already doing this in the
triangle_both() case.
Would like to share the calculated area with attribute interpolation,
but the way the code is structured makes this difficult.
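
As a rough illustration of the idea in this commit (not the actual
llvmpipe code), the area test can be done in floating point so that no
two fixed-point values are ever multiplied together. The names
FIXED_ONE, tri_signed_area() and tri_is_culled() are hypothetical, the
8 fractional bits are an assumption, and the sketch assumes the
integer deltas themselves fit in an int32_t:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical fixed-point scale: 8 fractional bits (an assumption). */
#define FIXED_ONE 256.0f

/* Signed parallelogram area (twice the triangle area) from fixed-point
 * vertex positions. The deltas are converted to float before the
 * multiply, so the product cannot overflow a 32-bit integer. */
static float
tri_signed_area(const int32_t v0[2], const int32_t v1[2], const int32_t v2[2])
{
   const float ex = (float)(v0[0] - v2[0]) / FIXED_ONE;
   const float ey = (float)(v0[1] - v2[1]) / FIXED_ONE;
   const float fx = (float)(v1[0] - v2[0]) / FIXED_ONE;
   const float fy = (float)(v1[1] - v2[1]) / FIXED_ONE;

   return ex * fy - ey * fx;
}

/* Decide culling from the sign of the area. Which sign counts as
 * front-facing depends on the winding convention; that mapping is left
 * to the caller here. Zero-area triangles are always dropped. */
static bool
tri_is_culled(float area, bool cull_positive, bool cull_negative)
{
   if (area > 0.0f)
      return cull_positive;
   if (area < 0.0f)
      return cull_negative;
   return true;
}
```

Computing the area once and passing it down is what allows the same
value to serve both the culling decision and the two-sided case.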
Keith Whitwell [Fri, 8 Oct 2010 16:01:16 +0000 (17:01 +0100)]
llvmpipe: fail gracefully on oom in scene creation
José Fonseca [Fri, 8 Oct 2010 14:50:28 +0000 (15:50 +0100)]
gallivm: Implement brilinear filtering.
José Fonseca [Fri, 8 Oct 2010 13:09:22 +0000 (14:09 +0100)]
gallivm: Fix copy'n'paste typo in previous commit.
José Fonseca [Fri, 8 Oct 2010 13:05:50 +0000 (14:05 +0100)]
gallivm: Clamp mipmap level and zero mip weight simultaneously.
José Fonseca [Fri, 8 Oct 2010 12:36:18 +0000 (13:36 +0100)]
gallivm: Use lp_build_ifloor_fract for lod computation.
Forgot this one before.
José Fonseca [Fri, 8 Oct 2010 12:26:37 +0000 (13:26 +0100)]
gallivm: Don't compute the second mipmap level when frac(lod) == 0
José Fonseca [Fri, 8 Oct 2010 12:24:39 +0000 (13:24 +0100)]
gallivm: Simplify lp_build_mipmap_level_sizes' interface.
José Fonseca [Fri, 8 Oct 2010 09:54:23 +0000 (10:54 +0100)]
gallivm: Do not do mipfiltering when magnifying.
If lod < 0, then it invariably follows that ilevel0 == ilevel1 == 0.
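
A scalar sketch of that observation, assuming linear mip filtering
between two adjacent levels. select_mip_levels() and its parameters
are hypothetical names, not gallivm's interface. When magnifying, both
levels collapse to 0 and the blend weight is zero, so the second level
never needs to be sampled:

```c
/* Pick the two mipmap levels and the blend weight for a given lod.
 * last_level is the index of the smallest available mip level. */
static void
select_mip_levels(float lod, int last_level,
                  int *level0, int *level1, float *weight)
{
   if (lod <= 0.0f) {
      /* Magnifying: only the base level contributes. */
      *level0 = 0;
      *level1 = 0;
      *weight = 0.0f;
      return;
   }

   /* lod is positive here, so truncation equals floor. */
   int l0 = (int)lod;

   if (l0 >= last_level) {
      /* Clamp the level and zero the weight in the same step. */
      *level0 = last_level;
      *level1 = last_level;
      *weight = 0.0f;
      return;
   }

   *level0 = l0;
   *level1 = l0 + 1;
   *weight = lod - (float)l0;
}
```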
Vinson Lee [Fri, 8 Oct 2010 11:56:49 +0000 (04:56 -0700)]
r600g: Remove unnecessary header.
Dave Airlie [Fri, 8 Oct 2010 09:55:05 +0000 (19:55 +1000)]
r600g: drop width/height per level storage.
these aren't used anywhere, so just waste memory.
Eric Anholt [Wed, 6 Oct 2010 05:30:42 +0000 (22:30 -0700)]
i965: Normalize cubemap coordinates like is done in the Mesa IR path.
Fixes glsl-fs-texturecube-2-*
Eric Anholt [Thu, 7 Oct 2010 16:13:09 +0000 (09:13 -0700)]
i965: Disable emitting if () statements on gen6 until we really fix them.
Dave Airlie [Thu, 7 Oct 2010 23:35:17 +0000 (09:35 +1000)]
r600g: add some RG texture format support.
Kristian Høgsberg [Thu, 7 Oct 2010 21:03:53 +0000 (17:03 -0400)]
gles2: Add GL_EXT_texture_format_BGRA8888 support
José Fonseca [Thu, 7 Oct 2010 21:03:59 +0000 (22:03 +0100)]
gallivm: Vectorize the rho computation.
Dave Airlie [Thu, 7 Oct 2010 05:32:05 +0000 (15:32 +1000)]
r600g: fix Z export enable bits.
we should be checking output array not input to decide.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Thu, 7 Oct 2010 05:13:09 +0000 (15:13 +1000)]
r600g: use format from the sampler view not from the texture.
we want to use the format from the sampler view which isn't always the
same as the texture format when creating sampler views.
Andre Maasikas [Wed, 6 Oct 2010 18:14:15 +0000 (21:14 +0300)]
r600g: fix evergreen interpolation setup
interp data is stored in gpr0 so first interp overwrote it
and subsequent ones got wrong values
reserve register 0 so it's not used for attribs.
alternative is to interpolate attrib0 last (reverse, as r600c does)
Chia-I Wu [Thu, 7 Oct 2010 04:14:38 +0000 (12:14 +0800)]
st/vega: Fix version check in context creation.
This fixes a regression since
4531356817ec8383ac35932903773de67af92e37.
Chia-I Wu [Thu, 7 Oct 2010 04:06:07 +0000 (12:06 +0800)]
targets/egl: Fix linking with libdrm.
Eric Anholt [Thu, 7 Oct 2010 00:29:29 +0000 (17:29 -0700)]
i965: Fix gen6 pointsize handling to match pre-gen6.
Fixes point-line-no-cull.
Bug #30532
Eric Anholt [Wed, 6 Oct 2010 19:10:31 +0000 (12:10 -0700)]
i965: Don't assume that WPOS is always provided on gen6 in the new FS.
We sensibly only provide it if the FS asks for it. We could actually
skip WPOS unless the FS needed WPOS.zw, but that's something for
later.
Fixes: glsl-texture2d and probably many others.
Eric Anholt [Wed, 6 Oct 2010 18:19:48 +0000 (11:19 -0700)]
i965: Add support for gl_FrontFacing on gen6.
Fixes glsl1-gl_FrontFacing var (2) with new FS.
Eric Anholt [Wed, 6 Oct 2010 18:13:22 +0000 (11:13 -0700)]
i965: Refactor gl_FrontFacing setup out of general variable setup.
Eric Anholt [Wed, 6 Oct 2010 18:04:02 +0000 (11:04 -0700)]
i965: Gen6's sampler messages are the same as Ironlake.
This should fix texturing in the new FS backend.
Eric Anholt [Wed, 6 Oct 2010 18:00:31 +0000 (11:00 -0700)]
i965: Don't do 1/w multiplication in new FS for gen6
Not needed now that we're doing barycentric.
Eric Anholt [Wed, 6 Oct 2010 17:22:22 +0000 (10:22 -0700)]
i965: Add some clarification of the WECtrl field.
Eric Anholt [Wed, 6 Oct 2010 18:25:05 +0000 (11:25 -0700)]
i965: Fix botch in the header_present case in the new FS.
I only set it on the color_regions == 0 case, missing the important
case, causing GPU hangs on pre-gen6.
José Fonseca [Wed, 6 Oct 2010 09:11:15 +0000 (10:11 +0100)]
llvmpipe: Cleanup depth-stencil clears.
Only cosmetic changes. No actual practical difference.
José Fonseca [Wed, 6 Oct 2010 09:09:37 +0000 (10:09 +0100)]
util: Cleanup util_pack_z_stencil and friends.
- Handle PIPE_FORMAT_Z32_FLOAT packing correctly.
- In the integer version z shouldn't be passed as a double.
- Make it clear that the integer versions should only be used for masks.
- Make integer type sizes explicit (uint32_t for now, although
uint64_t will be necessary later to encode f32_s8_x24).
José Fonseca [Wed, 6 Oct 2010 17:44:51 +0000 (18:44 +0100)]
gallivm: Compute lod as integer whenever possible.
More accurate/faster results for PIPE_TEX_MIPFILTER_NEAREST. Less
FP <-> SI conversion overall.
José Fonseca [Wed, 6 Oct 2010 13:53:19 +0000 (14:53 +0100)]
gallivm: Only apply min/max_lod when necessary.
Keith Whitwell [Thu, 30 Sep 2010 15:43:56 +0000 (16:43 +0100)]
gallivm: don't apply zero lod_bias
José Fonseca [Wed, 6 Oct 2010 17:31:36 +0000 (18:31 +0100)]
gallivm: Combined ifloor & fract helper.
The only way to ensure we don't do redundant FP <-> SI conversions.
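
A scalar sketch of why a combined helper saves conversions, under the
assumption that x fits in an int. ifloor_fract() here is a
hypothetical stand-in, not the gallivm builder function: the integer
part and the fraction share one float-to-int and one int-to-float
conversion instead of each computing its own floor.

```c
/* Compute floor(x) as an integer and x - floor(x) together. */
static void
ifloor_fract(float x, int *out_ipart, float *out_fract)
{
   int i = (int)x;          /* FP -> SI, truncates toward zero */
   float f = (float)i;      /* SI -> FP */

   if (f > x) {
      /* Truncation rounded up for a negative non-integer; step down. */
      i -= 1;
      f -= 1.0f;
   }

   *out_ipart = i;
   *out_fract = x - f;
}
```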
José Fonseca [Wed, 6 Oct 2010 16:44:05 +0000 (17:44 +0100)]
gallivm: Fast implementation of iround(log2(x))
Not tested yet, but should be correct.
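
One common way to get iround(log2(x)) cheaply, shown here as an
assumption about the general technique rather than as this commit's
code: scale x by sqrt(2), then read the IEEE 754 exponent field.
Since floor(log2(x * sqrt(2))) == floor(log2(x) + 0.5), the exponent
of the scaled value is the rounded logarithm. fast_iround_log2() is a
hypothetical name, and the sketch is only valid for positive, finite,
normalized single-precision inputs:

```c
#include <stdint.h>
#include <string.h>

/* Rounded base-2 logarithm via the float exponent bits. */
static int
fast_iround_log2(float x)
{
   const float scaled = x * 1.41421356f;   /* x * sqrt(2) */
   uint32_t bits;

   /* Reinterpret the float as its bit pattern without aliasing issues. */
   memcpy(&bits, &scaled, sizeof bits);

   /* Bits 23..30 hold the biased exponent; the bias is 127. */
   return (int)((bits >> 23) & 0xff) - 127;
}
```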
José Fonseca [Wed, 6 Oct 2010 13:06:14 +0000 (14:06 +0100)]
gallivm: Use a faster (and less accurate) log2 in lod computation.
José Fonseca [Wed, 6 Oct 2010 11:09:32 +0000 (12:09 +0100)]
gallivm: Take the type signedness in consideration in round/ceil/floor.
Eric Anholt [Mon, 4 Oct 2010 22:08:03 +0000 (15:08 -0700)]
i965: Fix up IF/ELSE/ENDIF for gen6.
The jump delta is now in the part of the instruction where the
destination fields used to be, and the src args are ignored (or not,
for the new non-predicated IF that we don't use yet).
Eric Anholt [Mon, 4 Oct 2010 22:09:18 +0000 (15:09 -0700)]
i965: Gen6 no longer has the IFF instruction; always use IF.
Eric Anholt [Wed, 6 Oct 2010 16:57:55 +0000 (09:57 -0700)]
i965: Add back gen6 headerless FB writes to the new FS backend.
It's not that hard to detect when we need the header.
Jerome Glisse [Wed, 6 Oct 2010 16:56:53 +0000 (12:56 -0400)]
r600g: fix dirty state handling
Avoid having objects end up in the dead list of dirty objects.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Eric Anholt [Tue, 5 Oct 2010 17:25:22 +0000 (10:25 -0700)]
i965: Also do constant propagation for the second operand of CMP.
We could do the first operand as well by flipping the comparison, but
this covered several CMPs in code I was looking at.
Eric Anholt [Tue, 5 Oct 2010 17:20:16 +0000 (10:20 -0700)]
i965: Enable the constant propagation code.
A debug disable had slipped in.
Jerome Glisse [Wed, 6 Oct 2010 13:40:27 +0000 (09:40 -0400)]
r600g: avoid segfault due to uninitialized list pointer
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
José Fonseca [Wed, 6 Oct 2010 08:40:51 +0000 (09:40 +0100)]
llvmpipe: Fix sprite coord perspective interpolation of Q.
Q coordinate's coefficients also need to be multiplied by w, otherwise
it will have 1/w, causing problems with TXP.
José Fonseca [Sun, 26 Sep 2010 10:22:20 +0000 (11:22 +0100)]
llvmpipe: Fix perspective interpolation for point sprites.
Once a fragment is generated with LP_INTERP_PERSPECTIVE set for an input,
it will do a divide by w for that input. Therefore it's not OK to treat
LP_INTERP_PERSPECTIVE as LP_INTERP_LINEAR or vice-versa, even if the
attribute is known to not vary.
A better strategy would be to take the primitive in consideration when
generating the fragment shader key, and therefore avoid the per-fragment
perspective divide.
José Fonseca [Tue, 5 Oct 2010 09:50:02 +0000 (10:50 +0100)]
llvmpipe: Dump a few missing shader key flags.
Keith Whitwell [Sun, 3 Oct 2010 10:39:02 +0000 (11:39 +0100)]
llvmpipe: make debug_fs_variant respect variant->nr_samplers
José Fonseca [Mon, 4 Oct 2010 16:05:03 +0000 (17:05 +0100)]
retrace: Handle clear_render_target and clear_depth_stencil.
Dave Airlie [Tue, 5 Oct 2010 23:21:16 +0000 (09:21 +1000)]
r600g: add evergreen stencil support.
this sets the stencil up for evergreen properly.
Jerome Glisse [Tue, 5 Oct 2010 20:14:11 +0000 (16:14 -0400)]
r600g: userspace fence to avoid kernel call for testing bo busy status
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Brian Paul [Tue, 5 Oct 2010 20:32:59 +0000 (14:32 -0600)]
st/mesa: replace assertion w/ conditional in framebuffer invalidation
https://bugs.freedesktop.org/show_bug.cgi?id=30632
NOTE: this is a candidate for the 7.9 branch.
Jerome Glisse [Tue, 5 Oct 2010 19:23:07 +0000 (15:23 -0400)]
r600g: simplify block relocation
Since the flush rework there can be only one relocation per
register in a block.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Bas Nieuwenhuizen [Tue, 5 Oct 2010 19:01:43 +0000 (21:01 +0200)]
r600g: use dirty list to track dirty blocks
Got a speed up by tracking the dirty blocks in a separate list instead
of looping through all blocks. This version should work with blocks
that get their dirty state disabled again, and I added a dirty check
during the flush as some blocks were already dirty.
Ian Romanick [Tue, 5 Oct 2010 17:07:16 +0000 (10:07 -0700)]
docs: added news item for 7.9 release
Also fix link to release notes in 7.9-rc1 news item.
Ian Romanick [Mon, 27 Sep 2010 17:17:11 +0000 (10:17 -0700)]
docs: Import news updates from 7.9 branch
Partially cherry-picked from commit
61653b488da76ee1ca4f77363e222d3b717dd865
Ian Romanick [Wed, 16 Jun 2010 21:28:08 +0000 (14:28 -0700)]
docs: Update mailing lists from sf.net to freedesktop.org
(cherry picked from commit
c19bc5de961fe5e1f8a17131bcfae3dbcccaca29)
Ian Romanick [Wed, 16 Jun 2010 21:24:46 +0000 (14:24 -0700)]
docs: download.html does not need to be updated for each release
(cherry picked from commit
41e371e351cc4c77b2b20a545af2dfa2dab253d7)
Ian Romanick [Tue, 5 Oct 2010 16:55:54 +0000 (09:55 -0700)]
docs: Import 7.8.x release notes from 7.8 branch.
Ian Romanick [Tue, 5 Oct 2010 16:54:09 +0000 (09:54 -0700)]
docs: Import 7.9 release notes from 7.9 branch.
Nicolas Kaiser [Tue, 5 Oct 2010 09:26:43 +0000 (11:26 +0200)]
nv50: fix always true conditional in shader optimization
Jerome Glisse [Tue, 5 Oct 2010 14:29:30 +0000 (10:29 -0400)]
r600g: improve bo flushing
Flush read cache before writing register. Track flushing inside
of a same cs and avoid reflushing same bo if not necessary. Almost
properly force flush if bo rendered to and then used as a texture
in same cs (missing pipeline flush dunno if it's needed or not).
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Jerome Glisse [Tue, 5 Oct 2010 12:42:42 +0000 (08:42 -0400)]
r600g: store reloc information in bo structure
Allow fast lookup of relocation information & id which
was a CPU time consuming operation.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Dave Airlie [Tue, 5 Oct 2010 09:08:41 +0000 (19:08 +1000)]
pb: fix numDelayed accounting
we weren't decreasing when removing from the list.
Dave Airlie [Tue, 5 Oct 2010 06:00:48 +0000 (16:00 +1000)]
r600g: avoid unneeded bo wait
if we know the bo has gone not busy, no need to add another bo wait
thanks to Andre (taiu) on irc for pointing this out.
Dave Airlie [Tue, 5 Oct 2010 06:00:23 +0000 (16:00 +1000)]
r600g: drop use_mem_constant.
since we plan on using dx10 constant buffers everywhere.
Dave Airlie [Tue, 5 Oct 2010 05:57:57 +0000 (15:57 +1000)]
r600g: drop mman allocator
we don't use this since constant buffers are now being used on all gpus.
Dave Airlie [Tue, 5 Oct 2010 05:51:38 +0000 (15:51 +1000)]
r600g: add bo busy backoff.
When we go to do a lot of bos in one draw like constant bufs we need
to avoid bouncing off the busy ioctl, this mitigates by backing off
on busy bos for a short amount of time.
Dave Airlie [Tue, 5 Oct 2010 05:50:58 +0000 (15:50 +1000)]
pb: don't keep checking buffers after first busy
If we assume busy buffers are added to the list in order it's unlikely
we'd find one after the first busy one that isn't busy.
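
The early-out can be sketched as follows, assuming the delayed list is
kept in submission order. struct delayed_buf and release_idle_prefix()
are hypothetical names for illustration, not the pb manager's own
types:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an entry on the delayed-free list. */
struct delayed_buf {
   bool busy;    /* still in use by the GPU */
   bool freed;   /* released by the scan below */
};

/* Release buffers from the head of the list and stop at the first busy
 * one. Because entries are ordered by submission, anything after a
 * busy buffer is assumed to be busy too, so it is not queried.
 * Returns the number of buffers released. */
static size_t
release_idle_prefix(struct delayed_buf *bufs, size_t count)
{
   size_t i;

   for (i = 0; i < count; i++) {
      if (bufs[i].busy)
         break;
      bufs[i].freed = true;
   }

   return i;
}
```

The trade-off is that an idle buffer sitting behind a busy one is only
reclaimed on a later scan, in exchange for far fewer busy queries.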
Dave Airlie [Tue, 5 Oct 2010 05:35:52 +0000 (15:35 +1000)]
r600g: add bo fenced list.
this just keeps a list of bos submitted together, and uses them to decide
bo busy state for the whole group.
Brian Paul [Tue, 5 Oct 2010 01:59:23 +0000 (19:59 -0600)]
swrast: fix choose_depth_texture_level() to respect mipmap filtering state
NOTE: this is a candidate for the 7.9 branch.
Marek Olšák [Mon, 4 Oct 2010 19:19:27 +0000 (21:19 +0200)]
r300g: fix microtiling for 16-bits-per-channel formats
These texture formats (like R16G16B16A16_UNORM) were untested until now
because st/mesa doesn't use them. I am testing this with a hacked st/mesa
here.
Marek Olšák [Tue, 5 Oct 2010 00:56:14 +0000 (02:56 +0200)]
update release notes for Gallium
I am trying to be exhaustive, but still I might have missed tons of other
changes to Gallium.
(cherry picked from commit
968a9ec76eadf55e8b58171884e1175d7b8cf59a)
Conflicts:
docs/relnotes-7.9.html
Ian Romanick [Mon, 4 Oct 2010 23:35:09 +0000 (16:35 -0700)]
docs: Add list of bugs fixed in 7.9
Eric Anholt [Mon, 4 Oct 2010 22:07:17 +0000 (15:07 -0700)]
i965: Add support for gen6 FB writes to the new FS.
This uses message headers for now, since we'll need it for MRT. We
can cut out the header later.
Eric Anholt [Mon, 4 Oct 2010 22:03:32 +0000 (15:03 -0700)]
i965: In disasm, gen6 fb writes don't put msg reg # in destreg_conditionalmod.
It instead sensibly appears in the src0 slot.
Eric Anholt [Mon, 4 Oct 2010 18:48:04 +0000 (11:48 -0700)]
i965: Add initial folding of constants into operand immediate slots.
We could try to detect this in expression handling and do it
proactively there, but it seems like less logic to do it in one
optional pass at the end.
Eric Anholt [Sun, 3 Oct 2010 22:15:18 +0000 (15:15 -0700)]
i965: Add trivial dead code elimination in the new FS backend.
The glsl core should be handling most dead code issues for us, but we
generate some things in codegen that may not get used, like the 1/w
value or pixel deltas. It seems a lot easier this way than trying to
work out up front whether we're going to use those values or not.
Eric Anholt [Sun, 3 Oct 2010 22:01:20 +0000 (15:01 -0700)]
i965: Be more conservative on live interval calculation.
This also means that our intervals now highlight dead code.
Vinson Lee [Mon, 4 Oct 2010 22:56:55 +0000 (15:56 -0700)]
r600g: Fix SCons build.
Jerome Glisse [Mon, 4 Oct 2010 14:40:07 +0000 (10:40 -0400)]
r600g: remove dead label & fix indentation
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Jerome Glisse [Mon, 4 Oct 2010 14:38:50 +0000 (10:38 -0400)]
r600g: rename radeon_ws_bo to r600_bo
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Jerome Glisse [Mon, 4 Oct 2010 14:37:32 +0000 (10:37 -0400)]
r600g: use r600_bo for relocation argument, simplify code
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Jerome Glisse [Mon, 4 Oct 2010 14:25:23 +0000 (10:25 -0400)]
r600g: allow r600_bo to be a sub allocation of a big bo
Add bo offset everywhere needed if r600_bo is ever a sub bo
of a bigger bo.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Jerome Glisse [Mon, 4 Oct 2010 14:06:13 +0000 (10:06 -0400)]
r600g: rename radeon_ws_bo to r600_bo
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Krzysztof Smiechowicz [Mon, 4 Oct 2010 18:43:29 +0000 (11:43 -0700)]
nvfx: Pair os_malloc_aligned() with os_free_aligned().
From AROS.
Dave Airlie [Mon, 4 Oct 2010 06:41:49 +0000 (16:41 +1000)]
r600g: TODO domain management
no wonder it was slow, the code is deliberately forcing stuff into GTT,
we used to have domain management but it seems to have disappeared.
Dave Airlie [Mon, 4 Oct 2010 06:26:46 +0000 (16:26 +1000)]
r600g: fix warning in bo_map function
Dave Airlie [Mon, 4 Oct 2010 06:24:59 +0000 (16:24 +1000)]
r600g: the code to check whether a new vertex shader is needed was wrong
this code was memcmp'ing two structs, but refcounting one of them afterwards,
so any subsequent memcmp was never going to work.
again this stops unnecessary uploads of vertex program.
Dave Airlie [Mon, 4 Oct 2010 05:58:39 +0000 (15:58 +1000)]
r600g: break out of search for reloc bo after finding it.
this function was taking quite a lot of pointless CPU.
Eric Anholt [Sun, 3 Oct 2010 07:24:09 +0000 (00:24 -0700)]
i965: Fix glean/texSwizzle regression in previous commit.
Easy enough patch, who needs a full test run. Oh, that's right. Me.
Eric Anholt [Sun, 3 Oct 2010 06:27:31 +0000 (23:27 -0700)]
i965: Set up swizzling of shadow compare results for GL_DEPTH_TEXTURE_MODE.
The brw_wm_surface_state.c handling of GL_DEPTH_TEXTURE_MODE doesn't
apply to shadow compares, which always return an intensity value. The
texture swizzles can do the job for us.
Fixes:
glsl1-shadow2D(): 1
glsl1-shadow2D(): 3
Eric Anholt [Sun, 3 Oct 2010 06:44:29 +0000 (23:44 -0700)]
i965: Add support for EXT_texture_swizzle to the new FS backend.
Marek Olšák [Sat, 2 Oct 2010 21:13:12 +0000 (23:13 +0200)]
r300g: add support for L8A8 colorbuffers
Blending with DST_ALPHA is undefined. SRC_ALPHA works, though.
I bet some other formats have similar limitations too.
Marek Olšák [Sat, 2 Oct 2010 19:42:22 +0000 (21:42 +0200)]
r300g: add support for R8G8 colorbuffers
The hw swizzles have been obtained by a brute force approach,
and only C0 and C2 are stored in UV88, the other channels are
ignored.
R16G16 is going to be a lot trickier.
Dave Airlie [Wed, 11 Aug 2010 09:04:05 +0000 (19:04 +1000)]
mesa/st: initial attempt at RG support for gallium drivers
passes all piglit RG tests with softpipe.
Kenneth Graunke [Sat, 2 Oct 2010 02:53:24 +0000 (19:53 -0700)]
i965: Fix incorrect batchbuffer size in gen6 clip state command.
FORCE_ZERO_RTAINDEX should be in the fourth (and final) dword.
Eric Anholt [Sat, 2 Oct 2010 00:18:07 +0000 (17:18 -0700)]
i965: Don't try to emit code if we failed register allocation.