Rob Clark [Fri, 17 Oct 2014 12:57:16 +0000 (08:57 -0400)]
freedreno/a3xx: only emit dirty consts
If app only updates (for example) vertex uniforms, it would be nice to
only re-emit those and not also frag uniforms. Means we need to mark
the first frag shader const buffer dirty after a clear.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
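
The per-stage dirty tracking described in this commit can be sketched as follows. This is a minimal illustration, not the freedreno code: every name here (const_state, mark_consts_dirty, emit_dirty_consts, after_clear) is hypothetical, and the "emit" is reduced to a counter.

```c
#include <assert.h>

/* Hypothetical names throughout; this is not the freedreno API. */
enum stage { STAGE_VERT = 0, STAGE_FRAG = 1, STAGE_COUNT = 2 };

struct const_state {
    unsigned dirty_mask;              /* bit i set: stage i must be re-emitted */
    unsigned emit_count[STAGE_COUNT]; /* how often each stage was emitted */
};

/* Called when the app updates uniforms for one stage. */
static void mark_consts_dirty(struct const_state *cs, enum stage s)
{
    assert(s < STAGE_COUNT);
    cs->dirty_mask |= 1u << s;
}

/* Called at draw time: emit only the stages whose bit is set. */
static void emit_dirty_consts(struct const_state *cs)
{
    for (int s = 0; s < STAGE_COUNT; s++) {
        if (cs->dirty_mask & (1u << s))
            cs->emit_count[s]++;      /* stand-in for writing the cmdstream */
    }
    cs->dirty_mask = 0;
}

/* A clear is assumed to clobber the frag consts, so flag them again. */
static void after_clear(struct const_state *cs)
{
    mark_consts_dirty(cs, STAGE_FRAG);
}
```

Updating only vertex uniforms and then drawing bumps the vertex counter alone; the frag consts are re-emitted only after something such as after_clear() marks them dirty again.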
Rob Clark [Wed, 15 Oct 2014 21:15:06 +0000 (17:15 -0400)]
freedreno/a3xx: more layer/level fixes
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Brian Paul [Mon, 20 Oct 2014 17:53:33 +0000 (11:53 -0600)]
mesa: fix 'feeedback' typo in comment
Trivial.
Brian Paul [Mon, 20 Oct 2014 17:49:17 +0000 (11:49 -0600)]
mesa: fix 'misalgned' typos in error messages
Trivial.
Brian Paul [Fri, 17 Oct 2014 19:31:53 +0000 (13:31 -0600)]
glsl: fix several use-after-free bugs
The get_variable_being_redeclared() function can free the 'var' argument.
Thereafter, we cannot assume that 'var' is a valid pointer. This patch
replaces 'var->name' with 'earlier->name' in two places and calls
is_gl_identifier(var->name) before 'var' might get freed.
This fixes several piglit GLSL crashes, including:
spec/glsl-1.50/execution/geometry/clip-distance-in-param
spec/glsl-1.50/execution/geometry/clip-distance-bulk-copy
spec/glsl-1.50/compiler/gs-redeclares-pervertex-out-before-global-redeclaration.geom
I'm not sure why these were not spotted sooner.
A similar bug was previously fixed by commit f9cecca7a.
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
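
The pattern behind this fix can be sketched in plain C. The types and helpers below (struct var, earlier_decl, get_redeclared, handle_declaration) are hypothetical stand-ins for ir_variable and the symbol table, not the GLSL compiler's code; only the ordering of reads relative to the freeing call is the point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for ir_variable and the GLSL symbol table. */
struct var { char name[32]; };

static struct var *earlier_decl;   /* one-entry "symbol table" */

static bool is_gl_identifier(const char *name)
{
    return strncmp(name, "gl_", 3) == 0;
}

/* Like get_variable_being_redeclared(): if an earlier declaration exists,
 * the new 'var' is freed and the earlier one is returned. */
static struct var *get_redeclared(struct var *var)
{
    if (earlier_decl && strcmp(earlier_decl->name, var->name) == 0) {
        free(var);                 /* 'var' is dangling from here on */
        return earlier_decl;
    }
    return NULL;
}

/* Returns true if the declared variable is a gl_ built-in. */
static bool handle_declaration(struct var *var, const char **name_out)
{
    assert(var != NULL);
    /* Read everything we need from 'var' BEFORE the call that may free it. */
    bool builtin = is_gl_identifier(var->name);

    struct var *earlier = get_redeclared(var);
    /* After the call, use 'earlier' if it exists; 'var' is only still
     * valid when nothing was freed. */
    *name_out = earlier ? earlier->name : var->name;
    return builtin;
}
```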
Tapani Pälli [Tue, 14 Oct 2014 09:39:54 +0000 (12:39 +0300)]
mesa: validate sampler uniforms during glUniform calls
Patch fixes 'glsl-2types-of-textures-on-same-unit' in WebGL conformance
test suite. No Piglit regressions, fixes gl-2.0-active-sampler-conflict.
To avoid adding a potentially heavy check during draw (valid_to_render),
the check is done during uniform updates by inspecting the TexturesUsed mask.
A new boolean variable is introduced to cache validation state.
v2: take into account case where 2 uniforms use same unit (curro)
also do the check only when SSO is not in use, SSO has own
path for sampler validation.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
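
A rough sketch of the idea, assuming a simplified per-unit bitmask of texture targets: validation runs when a sampler uniform changes and the result is cached in a boolean for draw time. The structs and names below are hypothetical and much simpler than Mesa's; the SSO path mentioned in v2 is not modelled.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_UNITS 8

/* Hypothetical, simplified program state; Mesa's real structs differ. */
enum tex_target { TEX_2D = 0, TEX_3D = 1, TEX_CUBE = 2 };

struct sampler_uniform { enum tex_target target; unsigned unit; };

struct program {
    unsigned textures_used[MAX_UNITS]; /* per unit: bitmask of targets */
    bool samplers_validated;           /* cached result, read at draw time */
};

/* Re-run when a sampler uniform changes, not on every draw. */
static void validate_samplers(struct program *prog,
                              const struct sampler_uniform *u, int count)
{
    for (int i = 0; i < MAX_UNITS; i++)
        prog->textures_used[i] = 0;
    prog->samplers_validated = true;

    for (int i = 0; i < count; i++) {
        assert(u[i].unit < MAX_UNITS);
        unsigned *mask = &prog->textures_used[u[i].unit];
        *mask |= 1u << u[i].target;
        /* More than one bit set: two sampler types share this unit.
         * Two uniforms of the same type on one unit leave one bit set. */
        if (*mask & (*mask - 1))
            prog->samplers_validated = false;
    }
}
```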
EdB [Sat, 11 Oct 2014 16:01:36 +0000 (18:01 +0200)]
clover: Don't return CL_INVALID_VALUE if there is no header.
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
EdB [Sat, 11 Oct 2014 22:58:39 +0000 (01:58 +0300)]
clover: Add allow_empty_tag.
To allow empty objs() list checks.
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
EdB [Mon, 20 Oct 2014 07:34:17 +0000 (10:34 +0300)]
clover: Add initial implementation of clCompileProgram for CL 1.2.
[ Francisco Jerez: General clean-up. ]
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
EdB [Wed, 8 Oct 2014 22:06:48 +0000 (01:06 +0300)]
clover: Add a simple compat::pair.
std::pair is not c++98/c++11 safe.
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Francisco Jerez [Wed, 8 Oct 2014 22:02:19 +0000 (01:02 +0300)]
clover/util: Allow using key_equals with pair-like objects other than std::pair.
Francisco Jerez [Wed, 8 Oct 2014 22:06:21 +0000 (01:06 +0300)]
clover/util: Define equality operators for a couple of compat classes.
Francisco Jerez [Wed, 8 Oct 2014 17:01:26 +0000 (20:01 +0300)]
clover/util: Fix construction of compat::vector with a general container as argument.
Tapani Pälli [Wed, 6 Aug 2014 06:46:54 +0000 (09:46 +0300)]
glsl: implement switch flow control using a loop
This patch removes the old variable-based logic for handling a break inside
a switch. The switch is placed inside a loop so that the existing
infrastructure for loop flow control can be used for the switch; as a result,
dead code elimination now also works properly.
A 'continue' inside a switch now needs special handling: the continue is
detected, the switch loop is broken out of, and continue is then called
for the enclosing loop.
v2: remove one unnecessary ir_expression (Curro)
Fixes following Piglit tests:
fs-exec-after-break.shader_test
fs-conditional-break.shader_test
No Piglit or es3conform regressions.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
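
The transformation can be illustrated at the C source level. This is only an analogy for what the compiler does on its IR: sum_with_switch() is the original form and sum_lowered() is a hand-written equivalent in which the switch body sits in a single-pass loop. Both function names are made up for this sketch, and fallthrough between cases is not modelled.

```c
#include <stdbool.h>

/* Reference: a switch inside a loop, using both break and continue. */
static int sum_with_switch(const int *v, int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++) {
        switch (v[i]) {
        case 0:
            continue;          /* skip the "+= 1" below */
        case 1:
            sum += 10;
            break;
        default:
            sum += 100;
            break;
        }
        sum += 1;
    }
    return sum;
}

/* Lowered form: the switch body runs inside a single-pass loop, so a switch
 * 'break' is an ordinary loop break. A 'continue' sets a flag, breaks out of
 * the wrapper loop, and is replayed on the enclosing loop. */
static int sum_lowered(const int *v, int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++) {
        bool do_continue = false;
        do {
            if (v[i] == 0) {
                do_continue = true;
                break;
            }
            if (v[i] == 1) {
                sum += 10;
                break;
            }
            sum += 100;
            break;
        } while (false);
        if (do_continue)
            continue;
        sum += 1;
    }
    return sum;
}
```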
Eric Anholt [Sat, 18 Oct 2014 11:50:05 +0000 (12:50 +0100)]
vc4: Translate 4-byte index buffers to 2 bytes.
Fixes assertion failures in 14 piglit tests (half of which now pass).
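
Narrowing an index buffer is a simple copy loop. The sketch below is an assumption about the general shape, not the vc4 code; in particular, the commit message does not say what happens to indices above 65535, so the error return here is purely illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Narrow 32-bit indices to 16-bit. Returns 0 on success, or -1 if any index
 * does not fit in 16 bits (how the real driver handles that case is not
 * stated in the commit message). */
static int translate_indices_u32_to_u16(const uint32_t *src, uint16_t *dst,
                                        size_t count)
{
    assert(src && dst);
    for (size_t i = 0; i < count; i++) {
        if (src[i] > UINT16_MAX)
            return -1;
        dst[i] = (uint16_t)src[i];
    }
    return 0;
}
```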
Eric Anholt [Fri, 3 Oct 2014 05:14:03 +0000 (22:14 -0700)]
vc4: Add support for rebasing texture levels so firstlevel == 0.
GLES2 doesn't have GL_TEXTURE_BASE_LEVEL, so the hardware doesn't. Fixes
piglit levelclamp, tex-miplevel-selection, and texture-storage/2D mipmap
rendering.
Eric Anholt [Fri, 17 Oct 2014 14:28:02 +0000 (15:28 +0100)]
vc4: Apply a Newton-Raphson step to improve RSQ
Fixes all the piglit built-in-functions/*sqrt tests, among others.
Eric Anholt [Fri, 17 Oct 2014 13:01:15 +0000 (14:01 +0100)]
vc4: Apply a Newton-Raphson step to improve RCP.
Fixes all the piglit floating-point *-op-div tests, among others.
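
For reference, one Newton-Raphson refinement step for a reciprocal and for a reciprocal square root looks like this. These are the textbook update formulas; the function names are invented here, and the rough starting estimate stands in for whatever the hardware RCP/RSQ instruction returns.

```c
#include <assert.h>

/* One Newton-Raphson step for 1/a, given a rough estimate x0:
 *   x1 = x0 * (2 - a * x0) */
static float refine_rcp(float a, float x0)
{
    return x0 * (2.0f - a * x0);
}

/* One Newton-Raphson step for 1/sqrt(a), given a rough estimate y0:
 *   y1 = y0 * (1.5 - 0.5 * a * y0 * y0) */
static float refine_rsq(float a, float y0)
{
    assert(a > 0.0f);
    return y0 * (1.5f - 0.5f * a * y0 * y0);
}

/* Absolute error helper, kept local to avoid depending on libm. */
static float abs_err(float got, float want)
{
    float d = got - want;
    return d < 0.0f ? -d : d;
}
```

Each step roughly squares the relative error, which is why a single step is often enough to bring a coarse hardware estimate close to full float precision.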
Eric Anholt [Fri, 17 Oct 2014 14:04:27 +0000 (15:04 +0100)]
vc4: Add a little bit more packet parsing to make dump reading easier.
Probably should have done this *before* staring at all those render lists
today.
Chris Forbes [Sat, 11 Oct 2014 05:19:17 +0000 (18:19 +1300)]
meta/msaa-blit: consider weird sample count case unreachable
Suppresses a bunch of warning noise about sample_map possibly being used
uninitialized.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Jason Ekstrand [Thu, 16 Oct 2014 19:16:08 +0000 (12:16 -0700)]
i965/fs: Change the type of booleans to UD and emit correct immediates
Before, we used a signed d-word for booleans, and the immediates we
emitted varied between signed and unsigned. This commit changes the type
to unsigned (I think that makes more sense) and makes immediates more
consistent. This allows copy propagation to work better and cleans up some
instructions.
total instructions in shared programs: 5473519 -> 5465864 (-0.14%)
instructions in affected programs: 432849 -> 425194 (-1.77%)
GAINED: 27
LOST: 0
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Kenneth Graunke [Fri, 17 Oct 2014 19:59:18 +0000 (12:59 -0700)]
i965/fs: Don't pass ir_variable * to emit_sampleid_setup().
gl_SampleID is a built-in variable that always is of type "int".
Suggested by Connor Abbott.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Eric Anholt [Fri, 17 Oct 2014 08:43:54 +0000 (09:43 +0100)]
vc4: Make some assertions about how many flushes/EOFs the simulator sees.
This caught the previous commit's bug in the kernel validator.
Eric Anholt [Fri, 17 Oct 2014 11:14:11 +0000 (12:14 +0100)]
vc4: Fix accidental dropping of the low bits of the store tilebuffer packet.
Notably this included the EOF flag (the other bits are the full buffer
dump selection, but we don't do full dumps), which caused the kernel
checking for frame completion to trigger.
Eric Anholt [Thu, 16 Oct 2014 09:17:57 +0000 (10:17 +0100)]
vc4: Set the primitive list format at the start of rendering.
The other driver does this manually before calling into each tile, but we
can just let it get binned into the tiles (saving repeated kernel
validation on the packet).
Fixes simulator assertion failures on polygon-mode and non-auto texwrap.
Eric Anholt [Fri, 17 Oct 2014 08:42:35 +0000 (09:42 +0100)]
vc4: Replace the FLUSH_ALL with FLUSH.
We don't need to emit all of our current state at the end of each bin
list. We're going to be smashing it all at the start of the next tile's
bin list, anyway.
Eric Anholt [Fri, 17 Oct 2014 08:40:12 +0000 (09:40 +0100)]
vc4: Add some comments about state management.
Eric Anholt [Thu, 16 Oct 2014 09:42:04 +0000 (10:42 +0100)]
vc4: Make sure there's exactly 1 tile store per tile coords packet.
It's not documented that I can see, but the other driver does it (check
vg_hw_4.c), and one of the HW guys confirmed that you really do need to do
it.
Michel Dänzer [Thu, 16 Oct 2014 06:10:20 +0000 (15:10 +0900)]
winsys/radeon: Use a single buffer cache manager again
The trick is to generate a unique buffer usage value for each possible
combination of domains and flags, with only one bit set each for the
domains and flags. This ensures pb_check_usage() only returns TRUE when
the domains and flags the cached buffer was created for exactly match
the requested ones.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
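
The encoding trick can be sketched as below. The sizes, make_usage(), and check_usage() are hypothetical, and check_usage() only assumes the usual "provided bits must cover requested bits" semantics for a pb_check_usage()-style test; the real winsys code may lay the bits out differently.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical sizes: 2 domain bits (e.g. GTT, VRAM) and 2 flag bits. */
#define NUM_DOMAIN_COMBOS 4   /* 2^2 possible domain sets */
#define NUM_FLAG_COMBOS   4   /* 2^2 possible flag sets */

/* One-hot encode the domain set in the low bits and the flag set above it,
 * so every (domains, flags) combination has exactly two bits set. */
static unsigned make_usage(unsigned domains, unsigned flags)
{
    assert(domains < NUM_DOMAIN_COMBOS && flags < NUM_FLAG_COMBOS);
    return (1u << domains) | (1u << (NUM_DOMAIN_COMBOS + flags));
}

/* Assumed semantics of a pb_check_usage()-style test: the cached buffer is
 * acceptable if it provides every requested usage bit. */
static bool check_usage(unsigned requested, unsigned provided)
{
    return (requested & provided) == requested;
}
```

With a plain bitmask, a request for domain set 0x1 would match a cached buffer created with 0x3; with the one-hot form, two values match only when both the domain set and the flag set are identical.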
Tom Stellard [Tue, 30 Sep 2014 14:32:33 +0000 (10:32 -0400)]
clover: Add environment variables for dumping kernel code v2
There are two debug variables:
CLOVER_DEBUG which you can set to any combination of llvm,clc,asm
(separated by commas) to dump llvm IR, OpenCL C, and native assembly.
CLOVER_DEBUG_FILE which you can set to a file name for dumping output
instead of stderr. If you set this variable, the output will be split
into three separate files with different suffixes: .cl for OpenCL C,
.ll for LLVM IR, and .asm for native assembly. Note that when data
is written, it is always appended to the files.
v2:
- Code cleanups
- Add CLOVER_DEBUG_FILE environment variable for dumping to a file.
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
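
Parsing a comma-separated value such as CLOVER_DEBUG=llvm,asm might look like the sketch below. It is written in C for illustration only (clover itself is C++), the flag names are hypothetical, and ignoring unknown tokens is an assumption since the commit message does not say how they are treated.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical flag names; clover's own identifiers may differ. */
enum { DBG_LLVM = 1 << 0, DBG_CLC = 1 << 1, DBG_ASM = 1 << 2 };

/* Parse a comma-separated list such as "llvm,asm" into a bitmask.
 * Unknown tokens are ignored here (an assumption of this sketch). */
static unsigned parse_debug_flags(const char *value)
{
    unsigned flags = 0;
    if (!value)
        return 0;
    while (*value) {
        size_t len = strcspn(value, ",");   /* length of the next token */
        if (len == 4 && strncmp(value, "llvm", 4) == 0)
            flags |= DBG_LLVM;
        else if (len == 3 && strncmp(value, "clc", 3) == 0)
            flags |= DBG_CLC;
        else if (len == 3 && strncmp(value, "asm", 3) == 0)
            flags |= DBG_ASM;
        value += len;
        if (*value == ',')
            value++;
    }
    assert((flags & ~(unsigned)(DBG_LLVM | DBG_CLC | DBG_ASM)) == 0);
    return flags;
}
```

A caller would typically pass getenv("CLOVER_DEBUG") straight in; an unset variable yields NULL and therefore no dumping.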
Tom Stellard [Fri, 26 Sep 2014 01:08:20 +0000 (18:08 -0700)]
clover: Register an llvm diagnostic handler v3
This will allow us to handle internal compiler errors.
v2:
- Code cleanups.
v3:
- More cleanups.
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Tom Stellard [Thu, 25 Sep 2014 13:23:17 +0000 (09:23 -0400)]
clover: Add support for compiling to native object code v3
v2:
- Split build_module_native() into three separate functions.
- Code cleanups.
v3:
- More cleanups.
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Tom Stellard [Thu, 25 Sep 2014 13:14:53 +0000 (09:14 -0400)]
gallium: Add PIPE_SHADER_IR_NATIVE to enum pipe_shader_ir
Drivers can return this value for PIPE_COMPUTE_CAP_IR_TARGET
if they want clover to give them native object code.
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Tom Stellard [Thu, 25 Sep 2014 13:04:25 +0000 (09:04 -0400)]
clover: Factor kernel argument parsing into its own function v2
v2:
- Code cleanups.
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Marek Olšák [Thu, 16 Oct 2014 21:19:59 +0000 (23:19 +0200)]
st/mesa: use pipe_sampler_view_release for releasing sampler views
This fixes a crash when exiting Firefox. I have really no idea how Firefox
does it. It seems to involve multiple contexts and multithreading.
v2: added an XXX comment
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81680
Acked by Christian König.
Cc: 10.2 10.3 <mesa-stable@lists.freedesktop.org>
Tested-by: Benjamin Bellec <b.bellec@gmail.com>
Kenneth Graunke [Wed, 15 Oct 2014 20:00:39 +0000 (13:00 -0700)]
mesa: Drop the "target" parameter from NewBufferObject().
NewBufferObject took a "target" parameter, which it blindly passed to
_mesa_initialize_buffer_object(), which ignored it.
Not much point in passing it around.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Andres Gomez [Thu, 16 Oct 2014 08:40:39 +0000 (11:40 +0300)]
glsl: Update and fix typos in README.
Chris Forbes [Sat, 11 Oct 2014 23:28:43 +0000 (12:28 +1300)]
i965: Flag BRW_ATOMIC_COUNTER_BUFFER when a possible ABO is respecified
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Chris Forbes [Sat, 11 Oct 2014 23:27:31 +0000 (12:27 +1300)]
mesa: Mark buffer objects that are used as atomic counter buffers
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Chris Forbes [Tue, 23 Sep 2014 10:16:23 +0000 (22:16 +1200)]
i965/disasm: Add missing message type for Gen7 DP untyped surface read
This is used to implement GLSL's atomicCounter() intrinsic. Previously
it *worked*, but the disassembly was bogus.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Chris Forbes [Tue, 23 Sep 2014 10:16:21 +0000 (22:16 +1200)]
i965: Correctly use ABO count to trigger flagging of new surfaces.
This would have *almost never* actually been an issue, since other state
tends to get flagged at the same time as new ABOs -- but still bogus.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Chris Forbes [Wed, 1 Oct 2014 07:38:43 +0000 (20:38 +1300)]
i965: No longer reemit textures on BRW_NEW_UNIFORM_BUFFER
This didn't make any sense, but papered over the missing TexBO flagging
we've just fixed, in a bunch of cases.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Chris Forbes [Wed, 1 Oct 2014 06:29:25 +0000 (19:29 +1300)]
i965: Dirty state in BO reallocation based on usage history
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Chris Forbes [Wed, 1 Oct 2014 08:31:45 +0000 (21:31 +1300)]
i965: Have mesa flag BRW_NEW_TEXTURE_BUFFER when a TexBO binding changes
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Chris Forbes [Wed, 1 Oct 2014 07:09:17 +0000 (20:09 +1300)]
i965: Add new dirty flag for new TexBOs.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Chris Forbes [Wed, 1 Oct 2014 07:04:37 +0000 (20:04 +1300)]
mesa: Mark buffer objects that are used as TexBOs
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Chris Forbes [Wed, 1 Oct 2014 06:27:11 +0000 (19:27 +1300)]
mesa: Mark buffer objects which are bound as UBOs
When a buffer object is bound to one of the indexed uniform buffer
binding points, assume that from that point on it may be used as
a uniform buffer.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Chris Forbes [Wed, 1 Oct 2014 06:19:47 +0000 (19:19 +1300)]
mesa: Add usage history bitfield to buffer objects
In the drivers, we occasionally want to reallocate the backing
store for a buffer object; often to avoid waiting for the GPU
to be finished with the previous contents.
At the point that happens, we don't have a good way of determining
where else the buffer object may be bound, and so no good way of
determining which dirty flags need to be raised -- it's fairly
expensive to go looking at all the possible binding points.
Until now, we've considered any BO to be possibly bound as a UBO or
TexBO, and flagged all that state to be reemitted.
Instead, remember what kinds of binding point this buffer has ever
been used with, so that the drivers can flag only what they need.
I don't expect these bits to ever be reset, but that doesn't matter
for reasonable apps.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
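
A minimal sketch of the usage-history idea, with hypothetical flag and function names modelled on the commit series rather than copied from Mesa: binding ORs a bit into the buffer's history, and reallocation maps that history to the dirty flags that actually need raising.

```c
#include <assert.h>

/* Hypothetical flag names modelled on the commit series. */
enum {
    USAGE_UNIFORM_BUFFER = 1 << 0,
    USAGE_TEXTURE_BUFFER = 1 << 1,
    USAGE_ATOMIC_COUNTER = 1 << 2,
};
enum {
    DIRTY_UBO   = 1 << 0,
    DIRTY_TEXBO = 1 << 1,
    DIRTY_ABO   = 1 << 2,
};

struct buffer_object { unsigned usage_history; };

/* Binding never clears bits: the history only grows. */
static void bind_as(struct buffer_object *bo, unsigned usage)
{
    assert(bo != 0);
    bo->usage_history |= usage;
}

/* On reallocating the backing store, flag only state this buffer could
 * have been bound to, rather than everything. */
static unsigned dirty_on_realloc(const struct buffer_object *bo)
{
    unsigned dirty = 0;
    if (bo->usage_history & USAGE_UNIFORM_BUFFER) dirty |= DIRTY_UBO;
    if (bo->usage_history & USAGE_TEXTURE_BUFFER) dirty |= DIRTY_TEXBO;
    if (bo->usage_history & USAGE_ATOMIC_COUNTER) dirty |= DIRTY_ABO;
    return dirty;
}
```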
Emil Velikov [Tue, 14 Oct 2014 15:10:50 +0000 (16:10 +0100)]
vc4: correctly include the source files
The kernel files are built into a separate static library and
all the functions that require it are already wrapped in ifdef
USE_VC4_SIMULATOR. Don't forget the header file :)
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Connor Abbott [Mon, 4 Aug 2014 20:49:34 +0000 (13:49 -0700)]
i965/fs: don't make a fake ir_texture in the Mesa IR frontend
Now that we've made all the texture emit code mostly independent of GLSL
IR, this isn't necessary any more.
Signed-off-by: Connor Abbott <connor.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Kenneth Graunke [Fri, 10 Oct 2014 09:41:20 +0000 (11:41 +0200)]
i965/fs: Refactor the texture emission logic into a single function.
Before, we had 3 different emit functions for various different gen's,
as well as some ancillary work that was the same across all gen's which
was either contained in functions or duplicated across the GLSL IR and
Mesa IR backends. Now, we have a single method, emit_texture(), that
takes all the information needed to make a texture instruction and
handles all the setup, and all we have to do to emit a texture
instruction while converting from GLSL IR, Mesa IR, or any new backend
is to extract the information emit_texture() needs and then call it.
v2: Significant rebasing (by Ken).
Signed-off-by: Connor Abbott <connor.abbott@intel.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Connor Abbott [Sat, 2 Aug 2014 01:08:08 +0000 (18:08 -0700)]
i965/fs: Make gather_channel() not use ir_texture.
Our new IR won't have ir_texture objects.
Signed-off-by: Connor Abbott <connor.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Connor Abbott [Sat, 2 Aug 2014 01:05:37 +0000 (18:05 -0700)]
i965/fs: Make swizzle_result() not use ir_texture.
Our new IR won't have ir_texture objects.
Signed-off-by: Connor Abbott <connor.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Connor Abbott [Fri, 15 Aug 2014 17:22:20 +0000 (10:22 -0700)]
i965/fs: fix integer textures with swizzles
This happened to work before, but it would convert the output to a float
and then back to an integer which seems bad.
Signed-off-by: Connor Abbott <connor.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Connor Abbott [Fri, 1 Aug 2014 23:47:58 +0000 (16:47 -0700)]
i965/fs: don't pass in ir_texture to emit_texture_*
At this point, the only thing it's used for is the opcode.
Signed-off-by: Connor Abbott <connor.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Connor Abbott [Fri, 1 Aug 2014 23:30:26 +0000 (16:30 -0700)]
i965/fs: don't use ir->type in emit_texture_gen4()
We already have the type from the original destination.
Signed-off-by: Connor Abbott <connor.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Connor Abbott [Fri, 1 Aug 2014 23:24:44 +0000 (16:24 -0700)]
i965/fs: Don't use ir->lod_info.grad.dPd<x,y> in emit_texture_*.
This drops a dependency on ir_texture objects.
v2 (Ken): Rename lod_components to grad_components, as it only has a
meaningful value for ir_txd. We could set it to 1 for TXL,
but there's no real need.
Signed-off-by: Connor Abbott <connor.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Connor Abbott [Fri, 1 Aug 2014 22:46:11 +0000 (15:46 -0700)]
i965/fs: Don't use ir->coordinate in emit_texture_*.
This drops a dependency on ir_texture objects.
Signed-off-by: Connor Abbott <connor.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Connor Abbott [Fri, 1 Aug 2014 22:03:03 +0000 (15:03 -0700)]
i965/fs: make rescale_texcoord() not use ir_texture.
Our new IR won't have ir_texture objects, but using glsl_type is fine.
v2 (Ken): Drop redundant ir->coordinate NULL check; rebase.
Signed-off-by: Connor Abbott <connor.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Connor Abbott [Fri, 1 Aug 2014 21:46:31 +0000 (14:46 -0700)]
i965/fs: Make emit_mcs_fetch() not use ir_texture.
Our new IR won't have ir_texture objects.
Signed-off-by: Connor Abbott <connor.abbott@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Kenneth Graunke [Mon, 1 Sep 2014 09:17:41 +0000 (02:17 -0700)]
i965/fs: Rename "length" to "components" in emit_mcs_fetch().
This is slightly clearer. Based on a patch by Connor Abbott.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Connor Abbott [Mon, 4 Aug 2014 22:20:38 +0000 (15:20 -0700)]
i965: Make brw_texture_offset() not use ir_texture.
Our new IR won't have ir_texture objects.
Signed-off-by: Connor Abbott <connor.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Connor Abbott [Fri, 1 Aug 2014 21:13:31 +0000 (14:13 -0700)]
i965/fs: don't use ir->offset in emit_texture_gen5.
v2 (Ken): Refactor the Gen7 code separately; rebase.
Signed-off-by: Connor Abbott <connor.abbott@intel.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Kenneth Graunke [Mon, 1 Sep 2014 08:58:06 +0000 (01:58 -0700)]
i965/fs: Move texel offset handling to visit(ir_texture *).
This moves the handling of non-constant texel offset subexpression trees
to the place where we visit other such subtrees. It also removes some
uses of ir->offset in emit_texture_gen7, which will be useful when we
write the backend for our new upcoming IR.
Based on a patch by Connor Abbott.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Kenneth Graunke [Mon, 1 Sep 2014 08:39:14 +0000 (01:39 -0700)]
i965: Drop ir->op != ir_txf condition in offset checking.
brw_lower_unnormalized_offset sets ir->offset to NULL if it applies the
texelFetchOffset workarounds, so there's no need to special case it
here---there won't be an offset for ir_txf.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Kenneth Graunke [Mon, 1 Sep 2014 08:36:43 +0000 (01:36 -0700)]
i965: Restore a lost comment about TXF offset bugs.
Eric's original code to work around TXF offset bugs contained a comment
explaining the problem, which was lost when Chris generalized it to an
IR transformation (in commit 598ca510b8a118c3c7e18b5d031a2b116120e0a6).
This commit adds the original comment to the newer code.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Rob Clark [Wed, 15 Oct 2014 17:08:00 +0000 (13:08 -0400)]
freedreno/ir3: large const support
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Wed, 15 Oct 2014 18:38:07 +0000 (14:38 -0400)]
freedreno: update generated headers
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Wed, 15 Oct 2014 14:29:17 +0000 (10:29 -0400)]
freedreno: fix layer_stride
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Wed, 15 Oct 2014 12:12:24 +0000 (08:12 -0400)]
freedreno: inline fd_draw_emit()
Manual LTO
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Tue, 14 Oct 2014 20:23:18 +0000 (16:23 -0400)]
freedreno/ir3: optimize shader key comparison
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Tue, 14 Oct 2014 18:27:47 +0000 (14:27 -0400)]
freedreno/a3xx: refactor/optimize emit
Because we reuse various bits of emit code (for state/vertex/prog/etc)
for both regular draws and internal draws (gmem<->mem, clear, etc), the
number of parameters getting passed around has been growing. Refactor
to group these into fd3_emit. This simplifies fxn signatures, avoids
passing around shader key on the stack, etc. It also gives us a nice
place to cache shader-variant lookup to avoid looking up shader variants
multiple times per draw (without having to *also* pass them around as
fxn args everywhere).
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Tue, 14 Oct 2014 16:20:54 +0000 (12:20 -0400)]
freedreno/a3xx: refactor vertex state emit
Get rid of fd3_vertex_buf and use fd_vertex_state directly for all
draws. Removes a tiny bit of CPU overhead for munging around the vertex
state every time it is emitted, but more importantly it cleans things up
for later optimizations, so the emit paths don't have to special case
internal draws (gmem<->mem, clears, etc) with regular draws.
Instead of constructing the fd3_vertex_buf array each time for internal
draws, pre-create solid_vbuf_state and blit_vbuf_state at context init
time.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Eric Anholt [Wed, 15 Oct 2014 15:16:09 +0000 (16:16 +0100)]
vc4: Fix the uniform debug output.
I dropped the shader index when moving to the compiled shader struct, but
didn't update the format string here.
Eric Anholt [Wed, 15 Oct 2014 14:25:57 +0000 (15:25 +0100)]
vc4: Add support for user clip plane and gl_ClipVertex.
Fixes about 15 piglit tests about interpolation and clipping.
Eric Anholt [Wed, 15 Oct 2014 15:39:54 +0000 (16:39 +0100)]
vc4: Move the output semantics setup to a helper.
I want to reuse it elsewhere to set up outputs that aren't in the TGSI.
Kenneth Graunke [Tue, 14 Oct 2014 06:45:07 +0000 (23:45 -0700)]
i965: Allow CSE on Gen4-5 unary math.
Due to the implicit move-from-GRF, unary math looks a lot like the Gen6+
math instruction: it's a single instruction (SEND) with a GRF source.
The difference is that it also implicitly clobbers a message register.
The only visible effect is that CSE will remove the MRF-clobbering from
later math operations. This should be fine; compute_to_mrf and
remove_redundant_mrf_writes don't look at the values populated by
implied writes, so they can't rely on those values being present.
Less interference may actually help those passes make more progress.
Binary math is still problematic, since it involves a separate MOV
instruction to load the second operand. We continue disallowing CSE for
binary math operations.
total instructions in shared programs: 3340303 -> 3340100 (-0.01%)
instructions in affected programs: 26927 -> 26724 (-0.75%)
Nothing hurt, gained, or lost. ~6% reduction on a few shaders.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Michel Dänzer [Wed, 8 Oct 2014 07:05:36 +0000 (16:05 +0900)]
r600g,radeonsi: Only set use_staging_texture = TRUE once
No need to check for setting the flag after we set it already.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Michel Dänzer [Wed, 8 Oct 2014 07:01:47 +0000 (16:01 +0900)]
r600g,radeonsi: Use staging texture for transfers if any miplevel is tiled
We set the NO_CPU_ACCESS flag for BO allocation in that case, so direct CPU
access may not work.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Michel Dänzer [Wed, 8 Oct 2014 07:34:46 +0000 (16:34 +0900)]
winsys/radeon: Use separate caching buffer manager for each set of flags
Otherwise the caching buffer manager may return a buffer which was created
with a different set of flags, which can cause trouble.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Andres Gomez [Tue, 7 Oct 2014 14:32:17 +0000 (17:32 +0300)]
configure.ac: check for libexpat when no pkg-config is available
Previously, when no pkg-config was available for
libexpat we would just add the needed linking
flags without any extra check.
Now, we check that the library and the headers are
also installed in the building environment.
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Tom Stellard [Tue, 14 Oct 2014 21:55:23 +0000 (17:55 -0400)]
clover: Fix regression in module serialization
We need to serialize semantic information for arguments, which was added
in 06139c56fa070f84a931a4ddbdb894c9e8d24f55.
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Jason Ekstrand [Tue, 14 Oct 2014 19:02:19 +0000 (12:02 -0700)]
i965/fs: Use the correct regs_written on unspill instructions
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Ilia Mirkin [Tue, 14 Oct 2014 02:39:48 +0000 (22:39 -0400)]
st/gbm: fix order of arguments passed to is_format_supported
Reported by Coverity
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Cc: mesa-stable@lists.freedesktop.org
Ilia Mirkin [Sun, 5 Oct 2014 16:35:51 +0000 (12:35 -0400)]
nouveau: 3d textures are unsupported, limit 3d levels to 1
Ideally there would be a swrast fallback, but the driver isn't ready for
that. This should avoid crashes if someone tries to use 3d textures
though.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Cc: mesa-stable@lists.freedesktop.org
Rob Clark [Wed, 1 Oct 2014 00:09:11 +0000 (20:09 -0400)]
freedreno: use tgsi_lowering
Now that the freedreno_lowering code is moved to tgsi_lowering, remove
our private copy and switch over to using the common version.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
David Heidelberger [Tue, 14 Oct 2014 00:25:01 +0000 (02:25 +0200)]
r300/compiler: remove useless check
This code is already inside if (!variable->C->is_r500), so there is no
need to check twice.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: David Heidelberger <david.heidelberger@ixit.cz>
Nick Sarnie [Fri, 12 Sep 2014 22:20:46 +0000 (18:20 -0400)]
ilo: Build pipe-loader for ilo
Trivial patch to create the pipe loader for ilo. All the code was already there.
Signed-off-by: Nick Sarnie <commendsarnex@gmail.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Tue, 14 Oct 2014 14:25:54 +0000 (15:25 +0100)]
automake: explicitly set TARGET_RADEON_{WINSYS,COMMON}
Originally the variables were set only once via the ?= operator but
that causes issues when doing incremental builds. They appear to be
undefined and missing from the dependency list despite their addition
to LIBADD.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84807
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Eric Anholt [Tue, 14 Oct 2014 13:28:14 +0000 (14:28 +0100)]
vc4: Fix render target NPOT alignment at small miplevels.
The texturing hardware takes the POT level 0 width/height and minifies
those. This is different from what we were doing, for example, for
273-wide's level 5: POT(273>>5) == 8, while POT(273)>>5 == 16.
Fixes piglit-depthstencil-render-miplevels 273.
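
The 273-wide example from this commit can be reproduced with a few lines of C. The helper names are invented for this sketch, and the clamp to a minimum size of 1 is an assumption added so the helpers behave at very high levels.

```c
#include <assert.h>

/* Round up to the next power of two (v must be non-zero). */
static unsigned pot(unsigned v)
{
    assert(v != 0);
    unsigned p = 1;
    while (p < v)
        p <<= 1;
    return p;
}

/* What the commit says the driver was doing: minify, then round up. */
static unsigned level_size_minify_first(unsigned base, unsigned level)
{
    unsigned m = base >> level;
    return pot(m ? m : 1);
}

/* What the texturing hardware does: round level 0 up, then minify. */
static unsigned level_size_pot_first(unsigned base, unsigned level)
{
    unsigned m = pot(base) >> level;
    return m ? m : 1;
}
```

For base 273 and level 5, the first helper gives 8 and the second gives 16, matching the numbers quoted above.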
Eric Anholt [Thu, 25 Sep 2014 21:57:01 +0000 (14:57 -0700)]
vc4: Add support for having 0 vertex elements used.
You have to load at least 1, according to the simulator. Fixes 4 piglit
tests and even more ES2 conformance tests.
Vinson Lee [Sat, 11 Oct 2014 05:40:21 +0000 (22:40 -0700)]
auxiliary/os: Add DragonFly BSD support in os_get_total_physical_memory.
This patch fixes this build error on DragonFly BSD.
CC os/os_misc.lo
os/os_misc.c: In function 'os_get_total_physical_memory':
os/os_misc.c:132:2: error: #error Unsupported *BSD
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Daniel Manjarres [Sun, 22 Jun 2014 16:47:58 +0000 (09:47 -0700)]
glx: Fix glxUseXFont for glxWindow and glxPixmaps
The current implementation of glxUseXFont requires creating
a temporary pixmap and graphics context, which requires a real
old-school X11 Window, not a glxDrawable. This patch changes
things so that glxUseXFont will also accept a glxWindow or
glxPixmap, and lookup the underlying X11 Drawable. Without
this patch glxUseXFont generates a giant stream of Xerrors
about bad drawables and bad graphics contexts.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54372
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
Chia-I Wu [Tue, 14 Oct 2014 00:29:16 +0000 (08:29 +0800)]
ilo: clear writer pointer after unmapping
It does not look like an issue now but it is good to be future proof. Spotted
by Courtney Goeltzenleuchter.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Eric Anholt [Mon, 13 Oct 2014 15:20:01 +0000 (16:20 +0100)]
vc4: Write the VPM read setup multiple times to queue all the inputs.
There's a 4-element fifo, and the size (number of dwords per vertex) field
is just 4 bits.
Fixes glsl-routing on sim.
Eric Anholt [Mon, 13 Oct 2014 13:38:10 +0000 (14:38 +0100)]
vc4: Add support for the TXL opcode.
There's a bit at the bottom of cube map stride (which has some formatting
bugs in the docs) which flips the bias coordinate to being an absolute
LOD.
Eric Anholt [Mon, 13 Oct 2014 13:11:28 +0000 (14:11 +0100)]
vc4: Improve the accuracy of SIN and COS.
This gets them to pass glsl-sin/cos. There was an obvious problem that I
was using the FRC code on the scaled input value, which means that we had
a range in [0, 1], while our Taylor series is most accurate across [-0.5, 0.5].
We can just slide things over, but that means flipping the sign of the
coefficients. After that, it was just a matter of stuffing more
coefficients in.
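
The effect of centring the reduced range on zero can be sketched as follows. This is not the vc4 shader code: it reduces by subtracting the nearest whole number of turns (so no coefficient sign flip is needed, unlike the half-period slide the commit describes), uses doubles, and the function names are made up.

```c
#include <assert.h>

/* Reduce x (in turns, i.e. angle / 2*pi) to [-0.5, 0.5) instead of the
 * [0, 1) range a plain fractional part would give. */
static double reduce_turns(double x)
{
    double shifted = x + 0.5;
    long n = (long)shifted;          /* truncates toward zero */
    if ((double)n > shifted)
        n--;                         /* make it a floor for negatives */
    return x - (double)n;
}

/* sin(angle) from a Taylor series around 0, evaluated on the reduced range.
 * 'terms' counts series terms; more terms means better accuracy near +-pi. */
static double taylor_sin(double angle, int terms)
{
    const double two_pi = 6.283185307179586;
    assert(terms > 0);
    double a = reduce_turns(angle / two_pi) * two_pi;   /* in [-pi, pi) */
    double term = a, sum = a;
    for (int k = 1; k < terms; k++) {
        /* Next odd-power term: multiply by -a^2 / ((2k)(2k+1)). */
        term *= -(a * a) / (double)((2 * k) * (2 * k + 1));
        sum += term;
    }
    return sum;
}
```

Because the series is expanded around 0, its worst-case argument is pi rather than 2*pi after this reduction, so fewer coefficients are needed for the same accuracy.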
Kenneth Graunke [Thu, 21 Aug 2014 21:41:17 +0000 (14:41 -0700)]
i965: Use unsynchronized maps for the program cache on LLC platforms.
There's no reason to stall on pwrite - the CPU always appends to the
buffer and never modifies existing contents, and the GPU never writes
it. Further, the CPU always appends new data before submitting a batch
that requires it.
This code predates the unsynchronized mapping feature, so we simply
didn't have the option when it was written.
Ideally, we would do this for non-LLC platforms too, but unsynchronized
mapping support only exists for LLC systems.
Saves a bunch of stall avoidance copies when uploading shaders.
v2: Rebase on changes to previous patch.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> [v1]
Kenneth Graunke [Thu, 21 Aug 2014 17:50:31 +0000 (10:50 -0700)]
i965: Issue performance warnings when copying the program cache BO.
We don't really want unnecessary buffer copying, so it'd be nice to know
when it's happening.
v2: Drop stall warnings when doing a read-only CPU mapping of the cache
BO. The GPU also uses it in a read-only fashion, so there won't be
any stalls, even though the buffer is busy. (Thanks to Chris Wilson
for catching this mistake.)
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> [v1]
Kenneth Graunke [Thu, 21 Aug 2014 17:42:05 +0000 (10:42 -0700)]
i965: Issue performance warnings on MapBufferRange stalls.
This is easy: we just need to use brw_map_bo instead of mapping it
directly.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>