mesa.git
10 years ago r600g,radeonsi: convert TGSI shader type to LLVM shader type
Marek Olšák [Tue, 23 Sep 2014 15:17:01 +0000 (17:17 +0200)]
r600g,radeonsi: convert TGSI shader type to LLVM shader type

The values are hardcoded in the LLVM backend, but the TGSI definitions are
going to be changed with tessellation, e.g. TGSI_PROCESSOR_COMPUTE will be
increased by 2.

We'll use VS for LS and HS, because there's nothing special about them
from the LLVM backend point of view, even though the hardware side is
different. We do the same for ES.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
10 years ago radeonsi: add some missing register definitions
Marek Olšák [Sun, 5 Oct 2014 22:19:31 +0000 (00:19 +0200)]
radeonsi: add some missing register definitions

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
10 years ago radeonsi: load ring resource descriptors only once
Marek Olšák [Sun, 5 Oct 2014 11:33:40 +0000 (13:33 +0200)]
radeonsi: load ring resource descriptors only once

v2: document the new functions

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
10 years ago radeonsi: clarify shader constant load functions
Marek Olšák [Fri, 26 Sep 2014 21:06:32 +0000 (23:06 +0200)]
radeonsi: clarify shader constant load functions

I'll need indexed loads without the meta data flag for tessellation later.
Also rename load_const to buffer_load_const to distinguish it from indexed
const loads.

v2: add comments

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
10 years ago radeonsi: statically declare resource and sampler arrays
Marek Olšák [Sun, 5 Oct 2014 10:38:54 +0000 (12:38 +0200)]
radeonsi: statically declare resource and sampler arrays

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
10 years ago radeonsi: remove conversion of DX9 FACE input to GL
Marek Olšák [Thu, 16 Oct 2014 14:20:26 +0000 (16:20 +0200)]
radeonsi: remove conversion of DX9 FACE input to GL

st/mesa and gallium expect the DX9 format, so this is useless.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
10 years ago radeonsi: revert hack for random failures in glsl-max-varyings
Marek Olšák [Tue, 14 Oct 2014 20:51:10 +0000 (22:51 +0200)]
radeonsi: revert hack for random failures in glsl-max-varyings

This reverts commit 032e5548b3d4b5efa52359218725cb8e31b622ad.

I've run glsl-max-varyings 30 times and it always passed.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
10 years ago radeonsi: generate shader pm4 states right after shader compilation
Marek Olšák [Tue, 14 Oct 2014 15:48:52 +0000 (17:48 +0200)]
radeonsi: generate shader pm4 states right after shader compilation

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
10 years ago radeonsi: make pm4 state generation for shaders independent of the context
Marek Olšák [Tue, 14 Oct 2014 15:36:30 +0000 (17:36 +0200)]
radeonsi: make pm4 state generation for shaders independent of the context

The si_pm4_delete_state calls became useless, because the pm4 state is
always generated only once.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
10 years ago radeonsi: inline si_pm4_alloc_state
Marek Olšák [Tue, 14 Oct 2014 15:31:00 +0000 (17:31 +0200)]
radeonsi: inline si_pm4_alloc_state

It seemed like the function needed a context pointer. Let's remove it
to make it less confusing.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
10 years ago r300g: replace r300_get_num_samples with a util variant
Marek Olšák [Mon, 20 Oct 2014 13:41:42 +0000 (15:41 +0200)]
r300g: replace r300_get_num_samples with a util variant

10 years ago glsl_to_tgsi: use _mesa_copy_linked_program_data
Marek Olšák [Mon, 6 Oct 2014 19:12:14 +0000 (21:12 +0200)]
glsl_to_tgsi: use _mesa_copy_linked_program_data

This deduplicates some code.

10 years ago glsl_to_tgsi: fix the value of gl_FrontFacing with native integers
Marek Olšák [Thu, 16 Oct 2014 14:21:54 +0000 (16:21 +0200)]
glsl_to_tgsi: fix the value of gl_FrontFacing with native integers

We must convert it to boolean from the DX9 float encoding that Gallium
specifies.

Later, we should probably define that FACE should be 0 or ~0 if native
integers are supported.
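
As a rough illustration of the conversion described above (not the actual
glsl_to_tgsi code): the DX9-style FACE value is assumed here to be positive
for front-facing and negative for back-facing primitives, and the integer
boolean is assumed to be ~0 for true and 0 for false. The helper name
face_to_bool is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: convert a DX9-style FACE float (assumed positive
 * for front-facing, negative for back-facing) into an integer boolean
 * where true is all ones (~0) and false is 0. */
static uint32_t face_to_bool(float face)
{
    return face > 0.0f ? 0xFFFFFFFFu : 0u;
}
```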

Cc: 10.2 10.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
10 years ago st/mesa: add ST_DEBUG=wf option which enables wireframe rendering
Marek Olšák [Sun, 5 Oct 2014 16:55:47 +0000 (18:55 +0200)]
st/mesa: add ST_DEBUG=wf option which enables wireframe rendering

Useful for tessellation.

10 years ago gallium: add PIPE_SHADER_CAP_MAX_OUTPUTS and use it in st/mesa
Marek Olšák [Wed, 1 Oct 2014 18:28:17 +0000 (20:28 +0200)]
gallium: add PIPE_SHADER_CAP_MAX_OUTPUTS and use it in st/mesa

With 5 shader stages and various combinations of enabled and disabled shaders,
the maximum number of outputs in one shader doesn't have to be equal to
the maximum number of inputs in the following shader.

v2: return 32 for softpipe and llvmpipe

10 years ago vc4: Fix SRC_ALPHA_SATURATE blending.
Eric Anholt [Tue, 21 Oct 2014 14:46:48 +0000 (15:46 +0100)]
vc4: Fix SRC_ALPHA_SATURATE blending.

Fixes glean blendFunc.

10 years ago vc4: Fix stencil writemask handling.
Eric Anholt [Mon, 20 Oct 2014 20:14:57 +0000 (21:14 +0100)]
vc4: Fix stencil writemask handling.

If the writemask doesn't compress, then we want to put in the uncompressed
writemask, not the compressed writemask failure value (all-on).

Fixes glean's stencil2 and fbo-clear-formats on stencil.

10 years ago vc4: Don't look at back stencil state unless two-sided stencil is enabled.
Eric Anholt [Mon, 20 Oct 2014 21:53:07 +0000 (22:53 +0100)]
vc4: Don't look at back stencil state unless two-sided stencil is enabled.

Fixes regressions in the next bugfix, because gallium util stuff leaves
the back stencil state as 0 if !back->enabled.

10 years ago freedreno/ir3: add debug flag to disable cp
Rob Clark [Sun, 19 Oct 2014 18:55:32 +0000 (14:55 -0400)]
freedreno/ir3: add debug flag to disable cp

FD_MESA_DEBUG=nocp will disable copy propagation pass.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years ago freedreno: positions come out as integers, not half-integers
Ilia Mirkin [Fri, 3 Oct 2014 20:23:19 +0000 (16:23 -0400)]
freedreno: positions come out as integers, not half-integers

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years ago freedreno/a3xx: disable early-z when we have kill's
Rob Clark [Sat, 18 Oct 2014 20:52:44 +0000 (16:52 -0400)]
freedreno/a3xx: disable early-z when we have kill's

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years ago freedreno/ir3: fix potential gpu lockup with kill
Rob Clark [Sat, 18 Oct 2014 19:28:16 +0000 (15:28 -0400)]
freedreno/ir3: fix potential gpu lockup with kill

It seems like the hardware is unhappy if we execute a kill instruction
prior to last input (ei).  Probably the shader thread stops executing
and the end-input flag is never set.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years ago freedreno/ir3: comment + better fxn name
Rob Clark [Sat, 18 Oct 2014 18:46:35 +0000 (14:46 -0400)]
freedreno/ir3: comment + better fxn name

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years ago freedreno/a3xx: only emit dirty consts
Rob Clark [Fri, 17 Oct 2014 12:57:16 +0000 (08:57 -0400)]
freedreno/a3xx: only emit dirty consts

If app only updates (for example) vertex uniforms, it would be nice to
only re-emit those and not also frag uniforms.  Means we need to mark
the first frag shader const buffer dirty after a clear.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years ago freedreno/a3xx: more layer/level fixes
Rob Clark [Wed, 15 Oct 2014 21:15:06 +0000 (17:15 -0400)]
freedreno/a3xx: more layer/level fixes

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years ago mesa: fix 'feeedback' typo in comment
Brian Paul [Mon, 20 Oct 2014 17:53:33 +0000 (11:53 -0600)]
mesa: fix 'feeedback' typo in comment

Trivial.

10 years ago mesa: fix 'misalgned' typos in error messages
Brian Paul [Mon, 20 Oct 2014 17:49:17 +0000 (11:49 -0600)]
mesa: fix 'misalgned' typos in error messages

Trivial.

10 years ago glsl: fix several use-after-free bugs
Brian Paul [Fri, 17 Oct 2014 19:31:53 +0000 (13:31 -0600)]
glsl: fix several use-after-free bugs

The get_variable_being_redeclared() function can free the 'var' argument.
Thereafter, we cannot assume that 'var' is a valid pointer.  This patch
replaces 'var->name' with 'earlier->name' in two places and calls
is_gl_identifier(var->name) before 'var' might get freed.
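
The pattern can be sketched in isolation as follows. This is not the GLSL
compiler code: struct variable, redeclare() and handle_declaration() are
hypothetical stand-ins, and the "gl_" prefix test only approximates what
is_gl_identifier() is assumed to check. The point is the ordering: read what
is needed from 'var' before the call that may free it, and use the returned
object afterwards.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

struct variable {
    char name[32];
};

/* Hypothetical stand-in for a function that may free its argument:
 * if an earlier declaration exists, 'var' is freed and 'earlier' is
 * returned; otherwise NULL is returned and 'var' stays valid. */
static struct variable *
redeclare(struct variable *var, struct variable *earlier)
{
    if (earlier != NULL) {
        free(var);              /* 'var' is dangling from here on */
        return earlier;
    }
    return NULL;
}

/* Query 'var' before the call that may free it, then only touch the
 * object that is known to be alive. */
static bool
handle_declaration(struct variable *var, struct variable *earlier,
                   const char **name_out)
{
    bool is_gl = strncmp(var->name, "gl_", 3) == 0;  /* read before the free */
    struct variable *kept = redeclare(var, earlier);

    /* 'var' is only still valid when nothing was redeclared. */
    *name_out = kept != NULL ? kept->name : var->name;
    return is_gl;
}
```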

This fixes several piglit GLSL crashes, including:
spec/glsl-1.50/execution/geometry/clip-distance-in-param
spec/glsl-1.50/execution/geometry/clip-distance-bulk-copy
spec/glsl-1.50/compiler/gs-redeclares-pervertex-out-before-global-redeclaration.geom

I'm not sure why these were not spotted sooner.
A similar bug was previously fixed by f9cecca7a.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years ago mesa: validate sampler uniforms during gluniform calls
Tapani Pälli [Tue, 14 Oct 2014 09:39:54 +0000 (12:39 +0300)]
mesa: validate sampler uniforms during gluniform calls

This patch fixes 'glsl-2types-of-textures-on-same-unit' in the WebGL
conformance test suite. No Piglit regressions, fixes gl-2.0-active-sampler-conflict.

To avoid adding a potentially heavy check during draw (valid_to_render),
the check is done during uniform updates by inspecting the TexturesUsed mask.

A new boolean variable is introduced to cache validation state.

v2: take into account the case where 2 uniforms use the same unit (curro);
    also do the check only when SSO is not in use, since SSO has its own
    path for sampler validation.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
10 years ago clover: Don't return CL_INVALID_VALUE if there is no header.
EdB [Sat, 11 Oct 2014 16:01:36 +0000 (18:01 +0200)]
clover: Don't return CL_INVALID_VALUE if there is no header.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
10 years ago clover: Add allow_empty_tag.
EdB [Sat, 11 Oct 2014 22:58:39 +0000 (01:58 +0300)]
clover: Add allow_empty_tag.

To allow empty objs() list checks.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
10 years ago clover: Add initial implementation of clCompileProgram for CL 1.2.
EdB [Mon, 20 Oct 2014 07:34:17 +0000 (10:34 +0300)]
clover: Add initial implementation of clCompileProgram for CL 1.2.

[ Francisco Jerez: General clean-up. ]

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
10 years ago clover: Add a simple compat::pair.
EdB [Wed, 8 Oct 2014 22:06:48 +0000 (01:06 +0300)]
clover: Add a simple compat::pair.

std::pair is not c++98/c++11 safe.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
10 years ago clover/util: Allow using key_equals with pair-like objects other than std::pair.
Francisco Jerez [Wed, 8 Oct 2014 22:02:19 +0000 (01:02 +0300)]
clover/util: Allow using key_equals with pair-like objects other than std::pair.

10 years ago clover/util: Define equality operators for a couple of compat classes.
Francisco Jerez [Wed, 8 Oct 2014 22:06:21 +0000 (01:06 +0300)]
clover/util: Define equality operators for a couple of compat classes.

10 years ago clover/util: Fix construction of compat::vector with a general container as argument.
Francisco Jerez [Wed, 8 Oct 2014 17:01:26 +0000 (20:01 +0300)]
clover/util: Fix construction of compat::vector with a general container as argument.

10 years ago glsl: implement switch flow control using a loop
Tapani Pälli [Wed, 6 Aug 2014 06:46:54 +0000 (09:46 +0300)]
glsl: implement switch flow control using a loop

This patch removes the old variable-based logic for handling a break inside
a switch. The switch is put inside a loop so that the existing infrastructure
for loop flow control can be used for the switch; dead code elimination
now also works properly.

A 'continue' inside a switch now needs special handling, which is done by
detecting the continue, breaking out of the switch loop, and calling
continue for the outside loop.
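
The idea can be illustrated at the source level in C. This is only an
analogy for the IR lowering, not the compiler's output; both function names
and the do_continue flag are made up for the sketch. The first function uses
an ordinary switch, the second expresses the same control flow with the
switch body wrapped in a loop that runs at most once.

```c
#include <stdbool.h>

/* Reference: an ordinary switch inside a loop, with break and continue. */
static int sum_with_switch(const int *v, int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++) {
        switch (v[i]) {
        case 0:
            continue;           /* skips the "+= 100" below */
        case 1:
            sum += 1;
            break;
        default:
            sum += 10;
            break;
        }
        sum += 100;
    }
    return sum;
}

/* Lowered form: the switch body sits in a loop that executes once, so a
 * switch 'break' becomes a plain loop break. A 'continue' sets a flag,
 * breaks out of the inner loop, and is re-issued for the outer loop. */
static int sum_with_loop(const int *v, int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++) {
        bool do_continue = false;
        for (;;) {
            if (v[i] == 0) {
                do_continue = true;
                break;
            }
            if (v[i] == 1) {
                sum += 1;
                break;
            }
            sum += 10;
            break;
        }
        if (do_continue)
            continue;
        sum += 100;
    }
    return sum;
}
```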

v2: remove one unnecessary ir_expression (Curro)

Fixes following Piglit tests:

   fs-exec-after-break.shader_test
   fs-conditional-break.shader_test

No Piglit or es3conform regressions.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
10 years ago vc4: Translate 4-byte index buffers to 2 bytes.
Eric Anholt [Sat, 18 Oct 2014 11:50:05 +0000 (12:50 +0100)]
vc4: Translate 4-byte index buffers to 2 bytes.

Fixes assertion failures in 14 piglit tests (half of which now pass).
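
A minimal sketch of what such a translation could look like, assuming the
indices are simply narrowed element by element. The helper narrow_indices is
hypothetical, and how the driver treats indices that do not fit in 16 bits is
not stated in this commit; the sketch just reports failure in that case.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: narrow 32-bit indices to 16-bit ones.
 * Returns false if any index does not fit in 16 bits. */
static bool narrow_indices(const uint32_t *src, uint16_t *dst, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (src[i] > UINT16_MAX)
            return false;
        dst[i] = (uint16_t)src[i];
    }
    return true;
}
```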

10 years ago vc4: Add support for rebasing texture levels so firstlevel == 0.
Eric Anholt [Fri, 3 Oct 2014 05:14:03 +0000 (22:14 -0700)]
vc4: Add support for rebasing texture levels so firstlevel == 0.

GLES2 doesn't have GL_TEXTURE_BASE_LEVEL, so the hardware doesn't.  Fixes
piglit levelclamp, tex-miplevel-selection, and texture-storage/2D mipmap
rendering.

10 years ago vc4: Apply a Newton-Raphson step to improve RSQ
Eric Anholt [Fri, 17 Oct 2014 14:28:02 +0000 (15:28 +0100)]
vc4: Apply a Newton-Raphson step to improve RSQ

Fixes all the piglit built-in-functions/*sqrt tests, among others.
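
For reference, the textbook Newton-Raphson refinement for a reciprocal square
root estimate y of 1/sqrt(a) is y' = y * (1.5 - 0.5 * a * y * y). The sketch
below shows that step in plain C; it is not the vc4 code, and rsq_refine is a
hypothetical name.

```c
/* One Newton-Raphson step for y ~ 1/sqrt(a):
 *   y' = y * (1.5 - 0.5 * a * y * y)
 * Each step roughly doubles the number of correct bits of the estimate. */
static float rsq_refine(float a, float y)
{
    return y * (1.5f - 0.5f * a * y * y);
}
```

Starting from a coarse estimate such as 0.49 for a = 4, a single step moves
the result much closer to the exact value 0.5.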

10 years ago vc4: Apply a Newton-Raphson step to improve RCP.
Eric Anholt [Fri, 17 Oct 2014 13:01:15 +0000 (14:01 +0100)]
vc4: Apply a Newton-Raphson step to improve RCP.

Fixes all the piglit floating-point *-op-div tests, among others.
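
For reference, the textbook Newton-Raphson refinement for a reciprocal
estimate x of 1/a is x' = x * (2 - a * x). The sketch below shows that step
in plain C; it is not the vc4 code, and rcp_refine is a hypothetical name.

```c
/* One Newton-Raphson step for x ~ 1/a:
 *   x' = x * (2 - a * x)
 * The relative error of the estimate is squared by each step. */
static float rcp_refine(float a, float x)
{
    return x * (2.0f - a * x);
}
```

Starting from a coarse estimate such as 0.33 for a = 3, a single step gives
about 0.3333.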

10 years ago vc4: Add a little bit more packet parsing to make dump reading easier.
Eric Anholt [Fri, 17 Oct 2014 14:04:27 +0000 (15:04 +0100)]
vc4: Add a little bit more packet parsing to make dump reading easier.

Probably should have done this *before* staring at all those render lists
today.

10 years ago meta/msaa-blit: consider weird sample count case unreachable
Chris Forbes [Sat, 11 Oct 2014 05:19:17 +0000 (18:19 +1300)]
meta/msaa-blit: consider weird sample count case unreachable

Suppresses a bunch of warning noise about sample_map possibly being used
uninitialized.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
10 years ago i965/fs: Change the type of booleans to UD and emit correct immediates
Jason Ekstrand [Thu, 16 Oct 2014 19:16:08 +0000 (12:16 -0700)]
i965/fs: Change the type of booleans to UD and emit correct immediates

Before, we used a signed d-word for booleans, and the immediates we
emitted varied between signed and unsigned.  This commit changes the type
to unsigned (I think that makes more sense) and makes immediates more
consistent.  This allows copy propagation to work better and cleans up some
instructions.

total instructions in shared programs: 5473519 -> 5465864 (-0.14%)
instructions in affected programs:     432849 -> 425194 (-1.77%)
GAINED:                                27
LOST:                                  0

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years ago i965/fs: Don't pass ir_variable * to emit_sampleid_setup().
Kenneth Graunke [Fri, 17 Oct 2014 19:59:18 +0000 (12:59 -0700)]
i965/fs: Don't pass ir_variable * to emit_sampleid_setup().

gl_SampleID is a built-in variable that always is of type "int".

Suggested by Connor Abbott.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
10 years ago vc4: Make some assertions about how many flushes/EOFs the simulator sees.
Eric Anholt [Fri, 17 Oct 2014 08:43:54 +0000 (09:43 +0100)]
vc4: Make some assertions about how many flushes/EOFs the simulator sees.

This caught the previous commit's bug in the kernel validator.

10 years ago vc4: Fix accidental dropping of the low bits of the store tilebuffer packet.
Eric Anholt [Fri, 17 Oct 2014 11:14:11 +0000 (12:14 +0100)]
vc4: Fix accidental dropping of the low bits of the store tilebuffer packet.

Notably this included the EOF flag (the other bits are the full buffer
dump selection, but we don't do full dumps), which caused the kernel
checking for frame completion to trigger.

10 years ago vc4: Set the primitive list format at the start of rendering.
Eric Anholt [Thu, 16 Oct 2014 09:17:57 +0000 (10:17 +0100)]
vc4: Set the primitive list format at the start of rendering.

The other driver does this manually before calling into each tile, but we
can just let it get binned into the tiles (saving repeated kernel
validation on the packet).

Fixes simulator assertion failures on polygon-mode and non-auto texwrap.

10 years ago vc4: Replace the FLUSH_ALL with FLUSH.
Eric Anholt [Fri, 17 Oct 2014 08:42:35 +0000 (09:42 +0100)]
vc4: Replace the FLUSH_ALL with FLUSH.

We don't need to emit all of our current state at the end of each bin
list.  We're going to be smashing it all at the start of the next tile's
bin list, anyway.

10 years ago vc4: Add some comments about state management.
Eric Anholt [Fri, 17 Oct 2014 08:40:12 +0000 (09:40 +0100)]
vc4: Add some comments about state management.

10 years ago vc4: Make sure there's exactly 1 tile store per tile coords packet.
Eric Anholt [Thu, 16 Oct 2014 09:42:04 +0000 (10:42 +0100)]
vc4: Make sure there's exactly 1 tile store per tile coords packet.

It's not documented that I can see, but the other driver does it (check
vg_hw_4.c), and one of the HW guys confirmed that you really do need to do
it.

10 years ago winsys/radeon: Use a single buffer cache manager again
Michel Dänzer [Thu, 16 Oct 2014 06:10:20 +0000 (15:10 +0900)]
winsys/radeon: Use a single buffer cache manager again

The trick is to generate a unique buffer usage value for each possible
combination of domains and flags, with only one bit set each for the
domains and flags. This ensures pb_check_usage() only returns TRUE when
the domains and flags the cached buffer was created for exactly match
the requested ones.
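
The trick can be sketched as follows, assuming pb_check_usage() is a subset
test of the form (requested & provided) == requested. The enum values,
usage_for() and check_usage() are hypothetical and are not the winsys code.
With a plain bitmask, a request for VRAM would also match a buffer created
for VRAM|GTT; with one bit per combination, only an exact match passes.

```c
#include <stdbool.h>

enum { DOMAIN_GTT = 1, DOMAIN_VRAM = 2 };   /* hypothetical domain bits */
enum { FLAG_A = 1, FLAG_B = 2 };            /* hypothetical flag bits */

#define NUM_DOMAIN_COMBOS 3   /* GTT, VRAM, GTT|VRAM (never empty) */

/* Encode a (domains, flags) combination as exactly one bit for the
 * domain combination plus exactly one bit for the flag combination.
 * 'domains' must be in 1..3 and 'flags' in 0..3. */
static unsigned usage_for(unsigned domains, unsigned flags)
{
    unsigned domain_bit = 1u << (domains - 1);               /* bits 0..2 */
    unsigned flags_bit = 1u << (NUM_DOMAIN_COMBOS + flags);  /* bits 3..6 */
    return domain_bit | flags_bit;
}

/* Subset test, assumed to mirror what pb_check_usage() does. */
static bool check_usage(unsigned requested, unsigned provided)
{
    return (requested & provided) == requested;
}
```

Because both values carry exactly one domain bit and one flag bit, the subset
test can only succeed when both bits coincide, which makes it an equality
test on the original (domains, flags) pair.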

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
10 years ago clover: Add environment variables for dumping kernel code v2
Tom Stellard [Tue, 30 Sep 2014 14:32:33 +0000 (10:32 -0400)]
clover: Add environment variables for dumping kernel code v2

There are two debug variables:

CLOVER_DEBUG which you can set to any combination of llvm,clc,asm
(separated by commas) to dump llvm IR, OpenCL C, and native assembly.

CLOVER_DEBUG_FILE which you can set to a file name for dumping output
instead of stderr.  If you set this variable, the output will be split
into three separate files with different suffixes: .cl for OpenCL C,
.ll for LLVM IR, and .asm for native assembly.  Note that when data
is written, it is always appended to the files.
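
As an illustration of the comma-separated syntax only (this is not how
clover itself parses the variable), a value such as "llvm,asm" could be
turned into a bitmask like this. The DBG_* names and parse_debug_flags() are
hypothetical.

```c
#include <string.h>

enum { DBG_LLVM = 1 << 0, DBG_CLC = 1 << 1, DBG_ASM = 1 << 2 };

/* Hypothetical parser for a CLOVER_DEBUG-style value: a comma-separated
 * list of "llvm", "clc" and "asm". Unknown tokens are ignored. */
static unsigned parse_debug_flags(const char *value)
{
    unsigned flags = 0;

    while (value != NULL && *value != '\0') {
        size_t len = strcspn(value, ",");   /* length of the current token */

        if (len == 4 && strncmp(value, "llvm", 4) == 0)
            flags |= DBG_LLVM;
        else if (len == 3 && strncmp(value, "clc", 3) == 0)
            flags |= DBG_CLC;
        else if (len == 3 && strncmp(value, "asm", 3) == 0)
            flags |= DBG_ASM;

        value += len;
        if (*value == ',')
            value++;                        /* skip the separator */
    }
    return flags;
}
```

A caller would typically pass the result of getenv("CLOVER_DEBUG"), which may
be NULL when the variable is unset; the sketch returns 0 in that case.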

v2:
  - Code cleanups
  - Add CLOVER_DEBUG_FILE environment variable for dumping to a file.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
10 years ago clover: Register an llvm diagnostic handler v3
Tom Stellard [Fri, 26 Sep 2014 01:08:20 +0000 (18:08 -0700)]
clover: Register an llvm diagnostic handler v3

This will allow us to handle internal compiler errors.

v2:
  - Code cleanups.

v3:
  - More cleanups.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
10 years ago clover: Add support for compiling to native object code v3
Tom Stellard [Thu, 25 Sep 2014 13:23:17 +0000 (09:23 -0400)]
clover: Add support for compiling to native object code v3

v2:
  - Split build_module_native() into three separate functions.
  - Code cleanups.

v3:
  - More cleanups.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
10 years ago gallium: Add PIPE_SHADER_IR_NATIVE to enum pipe_shader_ir
Tom Stellard [Thu, 25 Sep 2014 13:14:53 +0000 (09:14 -0400)]
gallium: Add PIPE_SHADER_IR_NATIVE to enum pipe_shader_ir

Drivers can return this value for PIPE_COMPUTE_CAP_IR_TARGET
if they want clover to give them native object code.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
10 years ago clover: Factor kernel argument parsing into its own function v2
Tom Stellard [Thu, 25 Sep 2014 13:04:25 +0000 (09:04 -0400)]
clover: Factor kernel argument parsing into its own function v2

v2:
  - Code cleanups.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
10 years ago st/mesa: use pipe_sampler_view_release for releasing sampler views
Marek Olšák [Thu, 16 Oct 2014 21:19:59 +0000 (23:19 +0200)]
st/mesa: use pipe_sampler_view_release for releasing sampler views

This fixes a crash when exiting Firefox. I have really no idea how Firefox
does it. It seems to involve multiple contexts and multithreading.

v2: added an XXX comment

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81680

Acked by Christian König.
Cc: 10.2 10.3 <mesa-stable@lists.freedesktop.org>
Tested-by: Benjamin Bellec <b.bellec@gmail.com>
10 years ago mesa: Drop the "target" parameter from NewBufferObject().
Kenneth Graunke [Wed, 15 Oct 2014 20:00:39 +0000 (13:00 -0700)]
mesa: Drop the "target" parameter from NewBufferObject().

NewBufferObject took a "target" parameter, which it blindly passed to
_mesa_initialize_buffer_object(), which ignored it.

Not much point in passing it around.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years ago glsl: Update and fix typos in README.
Andres Gomez [Thu, 16 Oct 2014 08:40:39 +0000 (11:40 +0300)]
glsl: Update and fix typos in README.

10 years ago i965: Flag BRW_ATOMIC_COUNTER_BUFFER when a possible ABO is respecified
Chris Forbes [Sat, 11 Oct 2014 23:28:43 +0000 (12:28 +1300)]
i965: Flag BRW_ATOMIC_COUNTER_BUFFER when a possible ABO is respecified

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
10 years ago mesa: Mark buffer objects that are used as atomic counter buffers
Chris Forbes [Sat, 11 Oct 2014 23:27:31 +0000 (12:27 +1300)]
mesa: Mark buffer objects that are used as atomic counter buffers

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
10 years ago i965/disasm: Add missing message type for Gen7 DP untyped surface read
Chris Forbes [Tue, 23 Sep 2014 10:16:23 +0000 (22:16 +1200)]
i965/disasm: Add missing message type for Gen7 DP untyped surface read

This is used to implement GLSL's atomicCounter() intrinsic. Previously
it *worked*, but the disassembly was bogus.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
10 years ago i965: Correctly use ABO count to trigger flagging of new surfaces.
Chris Forbes [Tue, 23 Sep 2014 10:16:21 +0000 (22:16 +1200)]
i965: Correctly use ABO count to trigger flagging of new surfaces.

This would have *almost never* actually been an issue, since other state
tends to get flagged at the same time as new ABOs -- but still bogus.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
10 years ago i965: No longer reemit textures on BRW_NEW_UNIFORM_BUFFER
Chris Forbes [Wed, 1 Oct 2014 07:38:43 +0000 (20:38 +1300)]
i965: No longer reemit textures on BRW_NEW_UNIFORM_BUFFER

This didn't make any sense, but papered over the missing TexBO flagging
we've just fixed, in a bunch of cases.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
10 years ago i965: Dirty state in BO reallocation based on usage history
Chris Forbes [Wed, 1 Oct 2014 06:29:25 +0000 (19:29 +1300)]
i965: Dirty state in BO reallocation based on usage history

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
10 years ago i965: Have mesa flag BRW_NEW_TEXTURE_BUFFER when a TexBO binding changes
Chris Forbes [Wed, 1 Oct 2014 08:31:45 +0000 (21:31 +1300)]
i965: Have mesa flag BRW_NEW_TEXTURE_BUFFER when a TexBO binding changes

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
10 years ago i965: Add new dirty flag for new TexBOs.
Chris Forbes [Wed, 1 Oct 2014 07:09:17 +0000 (20:09 +1300)]
i965: Add new dirty flag for new TexBOs.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
10 years ago mesa: Mark buffer objects that are used as TexBOs
Chris Forbes [Wed, 1 Oct 2014 07:04:37 +0000 (20:04 +1300)]
mesa: Mark buffer objects that are used as TexBOs

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
10 years ago mesa: Mark buffer objects which are bound as UBOs
Chris Forbes [Wed, 1 Oct 2014 06:27:11 +0000 (19:27 +1300)]
mesa: Mark buffer objects which are bound as UBOs

When a buffer object is bound to one of the indexed uniform buffer
binding points, assume that from that point on it may be used as
a uniform buffer.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
10 years ago mesa: Add usage history bitfield to buffer objects
Chris Forbes [Wed, 1 Oct 2014 06:19:47 +0000 (19:19 +1300)]
mesa: Add usage history bitfield to buffer objects

In the drivers, we occasionally want to reallocate the backing
store for a buffer object; often to avoid waiting for the GPU
to be finished with the previous contents.

At the point that happens, we don't have a good way of determining
where else the buffer object may be bound, and so no good way of
determining which dirty flags need to be raised -- it's fairly
expensive to go looking at all the possible binding points.

Until now, we've considered any BO to be possibly bound as a UBO or
TexBO, and flagged all that state to be reemitted.

Instead, remember what kinds of binding point this buffer has ever
been used with, so that the drivers can flag only what they need.
I don't expect these bits to ever be reset, but that doesn't matter
for reasonable apps.
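
A minimal sketch of the idea, with made-up names (struct buffer, USED_AS_*,
DIRTY_*, bind_as, dirty_on_realloc) that are not the actual Mesa or i965
identifiers: binding a buffer ORs a bit into its history, and reallocating
the backing store raises only the dirty flags matching that history.

```c
/* Hypothetical usage-history bits, one per kind of binding point. */
enum {
    USED_AS_UBO   = 1 << 0,
    USED_AS_TEXBO = 1 << 1,
    USED_AS_ABO   = 1 << 2,
};

/* Hypothetical driver dirty flags. */
enum {
    DIRTY_UBO   = 1 << 0,
    DIRTY_TEXBO = 1 << 1,
    DIRTY_ABO   = 1 << 2,
};

struct buffer {
    unsigned usage_history;   /* bits are only ever set, never cleared */
};

/* Record that the buffer has been bound to a given kind of binding point. */
static void bind_as(struct buffer *buf, unsigned usage)
{
    buf->usage_history |= usage;
}

/* On reallocation of the backing store, flag only the state that could
 * reference this buffer according to its history. */
static unsigned dirty_on_realloc(const struct buffer *buf)
{
    unsigned dirty = 0;

    if (buf->usage_history & USED_AS_UBO)
        dirty |= DIRTY_UBO;
    if (buf->usage_history & USED_AS_TEXBO)
        dirty |= DIRTY_TEXBO;
    if (buf->usage_history & USED_AS_ABO)
        dirty |= DIRTY_ABO;
    return dirty;
}
```

A buffer that was only ever used for vertex data would then trigger no
re-emission at all, while one that was once bound as a TexBO keeps raising
the TexBO flag even after it is unbound, which is the conservative behaviour
the message describes.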

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
10 years ago vc4: correctly include the source files
Emil Velikov [Tue, 14 Oct 2014 15:10:50 +0000 (16:10 +0100)]
vc4: correctly include the source files

The kernel files are built into a separate static library and
all the functions that require it are already wrapped in ifdef
USE_VC4_SIMULATOR. Don't forget the header file :)

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
10 years ago i965/fs: don't make a fake ir_texture in the Mesa IR frontend
Connor Abbott [Mon, 4 Aug 2014 20:49:34 +0000 (13:49 -0700)]
i965/fs: don't make a fake ir_texture in the Mesa IR frontend

Now that we've made all the texture emit code mostly independent of GLSL
IR, this isn't necessary any more.

Signed-off-by: Connor Abbott <connor.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years ago i965/fs: Refactor the texture emission logic into a single function.
Kenneth Graunke [Fri, 10 Oct 2014 09:41:20 +0000 (11:41 +0200)]
i965/fs: Refactor the texture emission logic into a single function.

Before, we had 3 different emit functions for various different gen's,
as well as some ancillary work that was the same across all gen's which
was either contained in functions or duplicated across the GLSL IR and
Mesa IR backends. Now, we have a single method, emit_texture(), that
takes all the information needed to make a texture instruction and
handles all the setup, and all we have to do to emit a texture
instruction while converting from GLSL IR, Mesa IR, or any new backend
is to extract the information emit_texture() needs and then call it.

v2: Significant rebasing (by Ken).

Signed-off-by: Connor Abbott <connor.abbott@intel.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years ago i965/fs: Make gather_channel() not use ir_texture.
Connor Abbott [Sat, 2 Aug 2014 01:08:08 +0000 (18:08 -0700)]
i965/fs: Make gather_channel() not use ir_texture.

Our new IR won't have ir_texture objects.

Signed-off-by: Connor Abbott <connor.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years ago i965/fs: Make swizzle_result() not use ir_texture.
Connor Abbott [Sat, 2 Aug 2014 01:05:37 +0000 (18:05 -0700)]
i965/fs: Make swizzle_result() not use ir_texture.

Our new IR won't have ir_texture objects.

Signed-off-by: Connor Abbott <connor.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years ago i965/fs: fix integer textures with swizzles
Connor Abbott [Fri, 15 Aug 2014 17:22:20 +0000 (10:22 -0700)]
i965/fs: fix integer textures with swizzles

This happened to work before, but it would convert the output to a float
and then back to an integer which seems bad.

Signed-off-by: Connor Abbott <connor.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years ago i965/fs: don't pass in ir_texture to emit_texture_*
Connor Abbott [Fri, 1 Aug 2014 23:47:58 +0000 (16:47 -0700)]
i965/fs: don't pass in ir_texture to emit_texture_*

At this point, the only thing it's used for is the opcode.

Signed-off-by: Connor Abbott <connor.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years ago i965/fs: don't use ir->type in emit_texture_gen4()
Connor Abbott [Fri, 1 Aug 2014 23:30:26 +0000 (16:30 -0700)]
i965/fs: don't use ir->type in emit_texture_gen4()

We already have the type from the original destination.

Signed-off-by: Connor Abbott <connor.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years ago i965/fs: Don't use ir->lod_info.grad.dPd<x,y> in emit_texture_*.
Connor Abbott [Fri, 1 Aug 2014 23:24:44 +0000 (16:24 -0700)]
i965/fs: Don't use ir->lod_info.grad.dPd<x,y> in emit_texture_*.

This drops a dependency on ir_texture objects.

v2 (Ken): Rename lod_components to grad_components, as it only has a
          meaningful value for ir_txd.  We could set it to 1 for TXL,
          but there's no real need.

Signed-off-by: Connor Abbott <connor.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years ago i965/fs: Don't use ir->coordinate in emit_texture_*.
Connor Abbott [Fri, 1 Aug 2014 22:46:11 +0000 (15:46 -0700)]
i965/fs: Don't use ir->coordinate in emit_texture_*.

This drops a dependency on ir_texture objects.

Signed-off-by: Connor Abbott <connor.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years ago i965/fs: make rescale_texcoord() not use ir_texture.
Connor Abbott [Fri, 1 Aug 2014 22:03:03 +0000 (15:03 -0700)]
i965/fs: make rescale_texcoord() not use ir_texture.

Our new IR won't have ir_texture objects, but using glsl_type is fine.

v2 (Ken): Drop redundant ir->coordinate NULL check; rebase.

Signed-off-by: Connor Abbott <connor.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years ago i965/fs: Make emit_mcs_fetch() not use ir_texture.
Connor Abbott [Fri, 1 Aug 2014 21:46:31 +0000 (14:46 -0700)]
i965/fs: Make emit_mcs_fetch() not use ir_texture.

Our new IR won't have ir_texture objects.

Signed-off-by: Connor Abbott <connor.abbott@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years ago i965/fs: Rename "length" to "components" in emit_mcs_fetch().
Kenneth Graunke [Mon, 1 Sep 2014 09:17:41 +0000 (02:17 -0700)]
i965/fs: Rename "length" to "components" in emit_mcs_fetch().

This is slightly clearer.  Based on a patch by Connor Abbott.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years ago i965: Make brw_texture_offset() not use ir_texture.
Connor Abbott [Mon, 4 Aug 2014 22:20:38 +0000 (15:20 -0700)]
i965: Make brw_texture_offset() not use ir_texture.

Our new IR won't have ir_texture objects.

Signed-off-by: Connor Abbott <connor.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years agoi965/fs: don't use ir->offset in emit_texture_gen5.
Connor Abbott [Fri, 1 Aug 2014 21:13:31 +0000 (14:13 -0700)]
i965/fs: don't use ir->offset in emit_texture_gen5.

v2 (Ken): Refactor the Gen7 code separately; rebase.

Signed-off-by: Connor Abbott <connor.abbott@intel.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years agoi965/fs: Move texel offset handling to visit(ir_texture *).
Kenneth Graunke [Mon, 1 Sep 2014 08:58:06 +0000 (01:58 -0700)]
i965/fs: Move texel offset handling to visit(ir_texture *).

This moves the handling of non-constant texel offset subexpression trees
to the place where we visit other such subtrees.  It also removes some
uses of ir->offset in emit_texture_gen7, which will be useful when we
write the backend for our new upcoming IR.

Based on a patch by Connor Abbott.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years agoi965: Drop ir->op != ir_txf condition in offset checking.
Kenneth Graunke [Mon, 1 Sep 2014 08:39:14 +0000 (01:39 -0700)]
i965: Drop ir->op != ir_txf condition in offset checking.

brw_lower_unnormalized_offset sets ir->offset to NULL if it applies the
texelFetchOffset workarounds, so there's no need to special case it
here---there won't be an offset for ir_txf.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
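
The reasoning in the commit above can be sketched in C as follows. This is only an illustrative model, not the i965 code: the types `tex_ir` and `tex_op` and the functions `lower_txf_offset()` and `needs_offset_handling()` are hypothetical stand-ins, and folding the offset into the coordinate is an assumption about what the workaround does. The point is that once the lowering pass has cleared the offset, a plain NULL check covers the txf case without testing the opcode.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real IR types; names are illustrative. */
enum tex_op { TEX_OP_TEX, TEX_OP_TXF };

struct tex_ir {
   enum tex_op op;
   const int *offset;   /* NULL when no texel offset is left to handle */
};

/* Model of a lowering pass (assumed behaviour): for txf, fold the offset
 * into the coordinate and clear the offset pointer. */
static void lower_txf_offset(struct tex_ir *ir, int *coord)
{
   if (ir->op == TEX_OP_TXF && ir->offset != NULL) {
      *coord += *ir->offset;
      ir->offset = NULL;
   }
}

/* After lowering, the later check needs no "op != txf" condition:
 * a txf instruction simply has no offset any more. */
static bool needs_offset_handling(const struct tex_ir *ir)
{
   return ir->offset != NULL;
}
```

In this model, a non-txf instruction keeps its offset and is still picked up by `needs_offset_handling()`.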
10 years agoi965: Restore a lost comment about TXF offset bugs.
Kenneth Graunke [Mon, 1 Sep 2014 08:36:43 +0000 (01:36 -0700)]
i965: Restore a lost comment about TXF offset bugs.

Eric's original code to work around TXF offset bugs contained a comment
explaining the problem, which was lost when Chris generalized it to an
IR transformation (in commit 598ca510b8a118c3c7e18b5d031a2b116120e0a6).

This commit adds the original comment to the newer code.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years agofreedreno/ir3: large const support
Rob Clark [Wed, 15 Oct 2014 17:08:00 +0000 (13:08 -0400)]
freedreno/ir3: large const support

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years agofreedreno: update generated headers
Rob Clark [Wed, 15 Oct 2014 18:38:07 +0000 (14:38 -0400)]
freedreno: update generated headers

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years agofreedreno: fix layer_stride
Rob Clark [Wed, 15 Oct 2014 14:29:17 +0000 (10:29 -0400)]
freedreno: fix layer_stride

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years agofreedreno: inline fd_draw_emit()
Rob Clark [Wed, 15 Oct 2014 12:12:24 +0000 (08:12 -0400)]
freedreno: inline fd_draw_emit()

Manual LTO

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years agofreedreno/ir3: optimize shader key comparison
Rob Clark [Tue, 14 Oct 2014 20:23:18 +0000 (16:23 -0400)]
freedreno/ir3: optimize shader key comparison

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years agofreedreno/a3xx: refactor/optimize emit
Rob Clark [Tue, 14 Oct 2014 18:27:47 +0000 (14:27 -0400)]
freedreno/a3xx: refactor/optimize emit

Because we reuse various bits of emit code (for state/vertex/prog/etc)
for both regular draws and internal draws (gmem<->mem, clear, etc), the
number of parameters getting passed around has been growing.  Refactor
to group these into fd3_emit.  This simplifies function signatures,
avoids passing the shader key around on the stack, etc.  It also gives
us a nice place to cache the shader-variant lookup, so variants are not
looked up multiple times per draw (without having to *also* pass them
around as function args everywhere).

Signed-off-by: Rob Clark <robclark@freedesktop.org>
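
A minimal sketch of the pattern described in the commit above, assuming nothing about the actual fd3_emit layout: the struct `emit`, its fields, `lookup_variant()` and `emit_get_vp()` are all hypothetical names chosen for illustration. It shows the two ideas from the message: grouping per-draw parameters into one struct, and caching the variant lookup in that struct so it runs at most once per draw.

```c
#include <stddef.h>

/* Hypothetical types; the real driver structures differ. */
struct shader_key { unsigned binning_pass; unsigned color_two_side; };
struct shader_variant { struct shader_key key; };

/* Everything an emit helper needs, passed as one pointer instead of
 * a growing list of function arguments. */
struct emit {
   const void *vtx;             /* vertex state (opaque in this sketch) */
   struct shader_key key;       /* lives in the struct, not on the stack */
   struct shader_variant *vp;   /* cached variant, NULL until first use */
};

static unsigned lookup_count;   /* counts the (assumed expensive) lookups */

/* Stand-in for a variant lookup; returns a single static variant. */
static struct shader_variant *lookup_variant(struct shader_key key)
{
   static struct shader_variant v;
   lookup_count++;
   v.key = key;
   return &v;
}

/* Lazy accessor: performs the lookup once, then reuses the cached result. */
static struct shader_variant *emit_get_vp(struct emit *emit)
{
   if (!emit->vp)
      emit->vp = lookup_variant(emit->key);
   return emit->vp;
}
```

Each helper that needs the variant calls `emit_get_vp(emit)`; only the first call per `struct emit` pays for the lookup.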
10 years agofreedreno/a3xx: refactor vertex state emit
Rob Clark [Tue, 14 Oct 2014 16:20:54 +0000 (12:20 -0400)]
freedreno/a3xx: refactor vertex state emit

Get rid of fd3_vertex_buf and use fd_vertex_state directly for all
draws.  Removes a tiny bit of CPU overhead for munging around the vertex
state every time it is emitted, but more importantly it cleans things up
for later optimizations, so the emit paths don't have to special case
internal draws (gmem<->mem, clears, etc) with regular draws.

Instead of constructing an fd3_vertex_buf array each time for internal
draws, pre-create solid_vbuf_state and blit_vbuf_state at context init
time.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years agovc4: Fix the uniform debug output.
Eric Anholt [Wed, 15 Oct 2014 15:16:09 +0000 (16:16 +0100)]
vc4: Fix the uniform debug output.

I dropped the shader index when moving to the compiled shader struct, but
didn't update the format string here.

10 years agovc4: Add support for user clip plane and gl_ClipVertex.
Eric Anholt [Wed, 15 Oct 2014 14:25:57 +0000 (15:25 +0100)]
vc4: Add support for user clip plane and gl_ClipVertex.

Fixes about 15 piglit tests about interpolation and clipping.

10 years agovc4: Move the output semantics setup to a helper.
Eric Anholt [Wed, 15 Oct 2014 15:39:54 +0000 (16:39 +0100)]
vc4: Move the output semantics setup to a helper.

I want to reuse it elsewhere to set up outputs that aren't in the TGSI.

10 years agoi965: Allow CSE on Gen4-5 unary math.
Kenneth Graunke [Tue, 14 Oct 2014 06:45:07 +0000 (23:45 -0700)]
i965: Allow CSE on Gen4-5 unary math.

Due to the implicit move-from-GRF, unary math looks a lot like the Gen6+
math instruction: it's a single instruction (SEND) with a GRF source.
The difference is that it also implicitly clobbers a message register.

The only visible effect is that CSE will remove the MRF-clobbering from
later math operations.  This should be fine; compute_to_mrf and
remove_redundant_mrf_writes don't look at the values populated by
implied writes, so they can't rely on those values being present.
Less interference may actually help those passes make more progress.

Binary math is still problematic, since it involves a separate MOV
instruction to load the second operand.  We continue disallowing CSE for
binary math operations.

total instructions in shared programs: 3340303 -> 3340100 (-0.01%)
instructions in affected programs:     26927 -> 26724 (-0.75%)
Nothing hurt, gained, or lost.  ~6% reduction on a few shaders.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
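
The rule described in the commit above can be modelled roughly as the predicate below. This is a sketch under stated assumptions, not the i965 implementation: the `opcode` enum, the `inst` struct and `is_cse_candidate()` are hypothetical, and the real backend inspects more state than an opcode and a source count. It captures only the decision from the message: on Gen6+ math is a normal instruction, on Gen4-5 unary math is allowed, and binary math stays excluded because of its separate MOV.

```c
#include <stdbool.h>

/* Hypothetical opcodes: one ALU op, two unary math ops, one binary math op. */
enum opcode { OP_ADD, OP_RCP, OP_SQRT, OP_POW };

struct inst {
   enum opcode op;
   int sources;     /* number of source operands */
};

static bool is_math(enum opcode op)
{
   return op == OP_RCP || op == OP_SQRT || op == OP_POW;
}

/* Returns true when the instruction may be considered for CSE. */
static bool is_cse_candidate(const struct inst *inst, int gen)
{
   if (!is_math(inst->op))
      return true;            /* ordinary ALU instruction */
   if (gen >= 6)
      return true;            /* Gen6+: math is a regular instruction */
   /* Gen4-5: only unary math; binary math needs a separate MOV to load
    * the second operand, so it remains disallowed. */
   return inst->sources == 1;
}
```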