git.libre-soc.org Git - mesa.git/log

commit | commitdiff | tree

Rob Clark [Sat, 25 Oct 2014 19:11:59 +0000 (15:11 -0400)]

freedreno/ir3: simplify RA

Group inputs/outputs, in addition to fanin/fanout, as they must also
exist in sequential scalar registers.  This lets us simplify RA by
working in terms of neighbor groups.

NOTE: has the slight problem that it can't optimize out mov's for things
like:

  MOV OUT[n], IN[m]

To avoid this, instead of trying to figure out what mov's we can
eliminate, we first remove all mov's prior to grouping, and then
re-insert mov's as needed while grouping inputs/outputs/fanins.
Eventually we'd prefer the frontend to not insert extra mov's in the
first place (so we don't have to bother removing them).  This is the
plan for an eventual NIR based frontend, so separate out the instr
grouping (which will still be needed for NIR frontend) from the mov
elimination (which won't).

Signed-off-by: Rob Clark <robclark@freedesktop.org>

commit | commitdiff | tree

Rob Clark [Fri, 2 Jan 2015 18:44:26 +0000 (13:44 -0500)]

freedreno/ir3: regmask support for relative addr

For temp arrays, a 32bit mask won't be sufficient.. but otoh we don't
need to support an arbitrary mask. So for this case use a simple size
field rather than a bitmask.

Signed-off-by: Rob Clark <robclark@freedesktop.org>

commit | commitdiff | tree

Rob Clark [Wed, 31 Dec 2014 01:00:40 +0000 (20:00 -0500)]

freedreno/ir3: split up ssa_src

Slight bit of refactoring that will be needed for indirect gpr
addressing (TEMP[ADDR[]]).

Signed-off-by: Rob Clark <robclark@freedesktop.org>

commit | commitdiff | tree

Rob Clark [Thu, 1 Jan 2015 05:56:43 +0000 (00:56 -0500)]

freedreno/ir3: drop instr_clone() stuff

Unnecessary and overly complicated. And gets in the way for temp arrays
(TEMP[ADDR[]]).

Signed-off-by: Rob Clark <robclark@freedesktop.org>

commit | commitdiff | tree

Rob Clark [Fri, 2 Jan 2015 14:28:23 +0000 (09:28 -0500)]

freedreno/ir3: runtime enable RA debug for DEBUG builds

Signed-off-by: Rob Clark <robclark@freedesktop.org>

commit | commitdiff | tree

Rob Clark [Fri, 2 Jan 2015 14:26:38 +0000 (09:26 -0500)]

freedreno/ir3: handle relative addr in ir3_dump

Signed-off-by: Rob Clark <robclark@freedesktop.org>

commit | commitdiff | tree

Rob Clark [Tue, 6 Jan 2015 21:44:26 +0000 (16:44 -0500)]

freedreno/ir3: legalize vs unused sam dst components

We probably could be more clever elsewhere and mask out components that
are not used. But either way, legalize should realize that there is
also a write-after-write hazard with texture sample instructions.

Signed-off-by: Rob Clark <robclark@freedesktop.org>

commit | commitdiff | tree

Rob Clark [Wed, 31 Dec 2014 00:56:56 +0000 (19:56 -0500)]

freedreno/ir3: hack for old compiler

Old compiler doesn't have ir3_block's.. so we need a special path. This
hack can be dropped when ir3_compiler_old is retired.

Signed-off-by: Rob Clark <robclark@freedesktop.org>

commit | commitdiff | tree

Rob Clark [Sun, 4 Jan 2015 21:33:37 +0000 (16:33 -0500)]

tgsi: track max array per file

NOTE IN[] and OUT[] don't need (have?) ArrayID's.. and TEMP[] can
optionally have them. So we implicitly assume that ArrayID==0 always
exists for each file. This is why array_max[file] is never less than
zero.

You can tell from indirect_files(_read/written) if the legacy array-
id zero was actually used.

Signed-off-by: Rob Clark <robclark@freedesktop.org>

commit | commitdiff | tree

Rob Clark [Sat, 3 Jan 2015 21:11:28 +0000 (16:11 -0500)]

tgsi: keep track of read vs written indirects

At least temporarily, I still need to fall back to the old compiler for
relative dest (for freedreno), but I can do relative src temp. This is
only a temporary situation, but it seems easy/reasonable for tgsi-scan
to track this.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>

commit | commitdiff | tree

Marek Olšák [Wed, 7 Jan 2015 23:10:18 +0000 (00:10 +0100)]

Revert "radeonsi: reduce the size of si_pm4_state"

This reverts commit 9141d8855555e45a057970e78969e1518ad3617d.

It broke OpenCL.

commit | commitdiff | tree

Tom Stellard [Wed, 7 Jan 2015 18:49:12 +0000 (13:49 -0500)]

radeonsi: Fix crash when destroying si_screen

We were invalidating si_screen:tm by calling
r600_destroy_common_screen() which frees the si_screen object. This
caused the driver to crash in LLVMDisposeTargetMachine() since we
were passing it an invalid pointer.

https://bugs.freedesktop.org/show_bug.cgi?id=88170

commit | commitdiff | tree

José Fonseca [Wed, 7 Jan 2015 14:27:12 +0000 (14:27 +0000)]

mesa: Don't use _mesa_generic_nop on Windows.

It doesn't work on Windows because of the STDCALL calling convention: it's
the callee's responsibility to pop the arguments, and the number of
arguments varies with the prototype, so the stack pointer ends up getting
corrupted.

This is just a non-invasive stop-gap fix. A proper fix would be more
elaborate, and require either:
- a variation of __glapi_noop_table which sets GL_INVALID_OPERATION
  error
- stop using APIENTRY on all internal _mesa_* functions.

Tested with piglit gl-1.0-beginend-coverage (it now fails instead of
crashing).

VMware PR1350505

Reviewed-by: Brian Paul <brianp@vmware.com>

commit | commitdiff | tree

José Fonseca [Wed, 7 Jan 2015 14:24:07 +0000 (14:24 +0000)]

glapi: Force frame pointer elimination on Windows.

To catch mismatches in cdecl vs stdcall calling convention. See code
comment for more detailed explanation.

Tested with piglit gl-1.0-beginend-coverage (it now also crashes on
debug builds.)

VMware PR1350505.

Reviewed-by: Brian Paul <brianp@vmware.com>

commit | commitdiff | tree

Marek Olšák [Sun, 4 Jan 2015 16:08:57 +0000 (17:08 +0100)]

radeonsi: enable LLVM optimizations that assume no NaNs for non-compute shaders

v2: complete rewrite

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>

commit | commitdiff | tree

Marek Olšák [Tue, 30 Dec 2014 17:41:25 +0000 (18:41 +0100)]

radeonsi: emit SURFACE_SYNC last

This fixes a case where a transform feedback buffer is fed back as an index
buffer, because SURFACE_SYNC must be after VS_PARTIAL_FLUSH.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

commit | commitdiff | tree

Marek Olšák [Mon, 29 Dec 2014 14:09:22 +0000 (15:09 +0100)]

radeonsi: flush all CB/DB caches unconditionally when changing the framebuffer

This is easier to read and will work better with shader image stores.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

commit | commitdiff | tree

Marek Olšák [Mon, 29 Dec 2014 00:25:48 +0000 (01:25 +0100)]

radeonsi: change TC cache flushing strategy for textures

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

commit | commitdiff | tree

Marek Olšák [Tue, 30 Dec 2014 15:45:51 +0000 (16:45 +0100)]

radeonsi: improve and fix streamout flushing

- we don't usually need to flush TC L2
- we should flush KCACHE
  (not really an issue now since we always flush KCACHE when updating
   descriptors, but it could be a problem if we used CE, which doesn't
   require flushing KCACHE)
- add an explicit VS_PARTIAL_FLUSH flag

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

commit | commitdiff | tree

Marek Olšák [Mon, 29 Dec 2014 13:53:11 +0000 (14:53 +0100)]

radeonsi: use TC L2 for CP DMA operations with shader resources on CIK

So that TC L2 doesn't need to be flushed.

The only problem is with index buffers, which don't use TC.
A simple solution is added that flushes TC L2 before a draw call (TC_L2_dirty).

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

commit | commitdiff | tree

Marek Olšák [Mon, 29 Dec 2014 12:22:00 +0000 (13:22 +0100)]

radeonsi: use TC L2 for updating descriptors on CIK

This allows not flushing TC L2 on CIK later.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

commit | commitdiff | tree

Marek Olšák [Sun, 4 Jan 2015 21:16:53 +0000 (22:16 +0100)]

radeonsi: don't use TC L2 for updating descriptors on SI

It's causing problems, because we mix uncached CP DMA with cached WRITE_DATA
when updating the same memory.

The solution for SI is to use uncached access here, because CP DMA doesn't
support cached access.

CIK will be handled in the next patch.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

commit | commitdiff | tree

Marek Olšák [Mon, 29 Dec 2014 13:45:49 +0000 (14:45 +0100)]

radeonsi: only flush the right set of caches for CP DMA operations

That's either framebuffer caches or caches for shader resources.
The motivation is that framebuffer caches need to be flushed very rarely
here.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

commit | commitdiff | tree

Marek Olšák [Sun, 28 Dec 2014 22:11:38 +0000 (23:11 +0100)]

radeonsi: implement separate ICACHE and KCACHE flush for SI

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

commit | commitdiff | tree

Marek Olšák [Tue, 30 Dec 2014 12:08:32 +0000 (13:08 +0100)]

radeonsi: add a combined flag for flushing a framebuffer

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

commit | commitdiff | tree

Marek Olšák [Mon, 29 Dec 2014 13:02:46 +0000 (14:02 +0100)]

radeonsi: rename flush flags, split the TC flag into L1 and L2

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

commit | commitdiff | tree

Marek Olšák [Mon, 29 Dec 2014 12:39:42 +0000 (13:39 +0100)]

r600g,radeonsi: separate cache flush flags

I will rename them for radeonsi.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

commit | commitdiff | tree

Marek Olšák [Mon, 29 Dec 2014 12:27:46 +0000 (13:27 +0100)]

r600g: move r6xx-specific streamout flush flagging into r600g

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

commit | commitdiff | tree

Marek Olšák [Sun, 4 Jan 2015 21:01:43 +0000 (22:01 +0100)]

radeonsi: only set BC_OPTIMIZE_DISABLE when necessary

SPI_PS_IN_CONTROL is moved into the SPI mapping state.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

commit | commitdiff | tree

Marek Olšák [Sun, 4 Jan 2015 20:05:14 +0000 (21:05 +0100)]

radeonsi: do not define FACE as an ordinary PS input

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

commit | commitdiff | tree

Marek Olšák [Sun, 4 Jan 2015 19:23:51 +0000 (20:23 +0100)]

radeonsi: remove flatshade from the shader key

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

commit | commitdiff | tree

Marek Olšák [Sun, 4 Jan 2015 19:09:51 +0000 (20:09 +0100)]

radeonsi: remove special handling of TGSI_INTERPOLATE_COLOR in shader codegen

It doesn't do anything useful. And colors are floating-point, so we can use
fs.interp, remove "flatshade" from the shader key, and rely on the FLAT_SHADE
state only (in the next patch).

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

commit | commitdiff | tree

Marek Olšák [Sun, 4 Jan 2015 13:51:01 +0000 (14:51 +0100)]

radeonsi: implement VERTEXID_NOBASE and BASEVERTEX system values

Only done for completeness. Not used by anything yet.

Tested by advertising PIPE_CAP_VERTEXID_NOBASE.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

commit | commitdiff | tree

Marek Olšák [Sun, 4 Jan 2015 13:41:49 +0000 (14:41 +0100)]

radeonsi: fix VertexID for OpenGL

This fixes all failing piglit VertexID tests.

Cc: 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

commit | commitdiff | tree

Marek Olšák [Sun, 28 Dec 2014 20:51:35 +0000 (21:51 +0100)]

radeonsi: clarify a hw bug in shader exports

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

commit | commitdiff | tree

Marek Olšák [Sun, 4 Jan 2015 19:45:35 +0000 (20:45 +0100)]

radeonsi: use ordered compares for SSG and face selection

Ordered compares are what you have in C. Unordered compares are the result
of negating ordered compares (they return true if either argument is NaN).

That special NaN behavior is completely useless here, and unordered
compares produce horrible code with all stable LLVM versions.
(I think that has been fixed in LLVM git)
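
The distinction can be sketched in C, assuming IEEE 754 semantics and no
fast-math flags. The helper names are illustrative and are not taken from
radeonsi:

```c
#include <math.h>
#include <stdbool.h>

/* Ordered less-than: false whenever either operand is NaN. */
static bool ordered_lt(float a, float b)
{
    return a < b;
}

/* Unordered less-than, written as the negation of the ordered >=:
 * true when a < b, and also true when either operand is NaN. */
static bool unordered_lt(float a, float b)
{
    return !(a >= b);
}
```

For non-NaN inputs both helpers agree; they differ only when an operand
is NaN, which is the case the commit message calls useless here.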

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

commit | commitdiff | tree

Marek Olšák [Tue, 30 Dec 2014 23:51:27 +0000 (00:51 +0100)]

radeonsi: remove unused and not useful variables

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

commit | commitdiff | tree

Marek Olšák [Tue, 30 Dec 2014 23:42:22 +0000 (00:42 +0100)]

radeonsi: remove init config from states

It really doesn't do anything there.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

commit | commitdiff | tree

Marek Olšák [Tue, 30 Dec 2014 22:49:59 +0000 (23:49 +0100)]

radeonsi: reduce the size of si_pm4_state

- the relocs array is unused, remove it
- ndw is at most 115 (init), set 140 as the maximum
- compute needs 4 buffers per state, graphics only needs 1; set 4 as the maximum

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

commit | commitdiff | tree

Marek Olšák [Sun, 4 Jan 2015 20:58:42 +0000 (21:58 +0100)]

tgsi: add uses_centroid into tgsi_shader_info

commit | commitdiff | tree

Marek Olšák [Sun, 4 Jan 2015 14:43:47 +0000 (15:43 +0100)]

st/mesa: fix GL_PRIMITIVE_RESTART_FIXED_INDEX

Cc: 10.2 10.3 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

commit | commitdiff | tree

Marek Olšák [Sun, 4 Jan 2015 13:27:33 +0000 (14:27 +0100)]

vbo: ignore primitive restart if FixedIndex is enabled in DrawArrays

From GL 4.4 Core profile:

  If both PRIMITIVE_RESTART and PRIMITIVE_RESTART_FIXED_INDEX are
  enabled, the index value determined by PRIMITIVE_RESTART_FIXED_INDEX is
  used. If PRIMITIVE_RESTART_FIXED_INDEX is enabled, primitive restart is not
  performed for array elements transferred by any drawing command not taking a
  type parameter, including all of the *Draw* commands other than
  *DrawElements*.
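
A rough C sketch of the rule quoted above. Both function names are
hypothetical and this is not the vbo code; the fixed restart index of
2^N - 1 for an N-bit index type comes from the GL specification:

```c
#include <stdbool.h>
#include <stdint.h>

/* Restart index used by PRIMITIVE_RESTART_FIXED_INDEX: 2^N - 1 for an
 * N-bit index type, i.e. the largest value the type can hold. */
static uint32_t fixed_restart_index(unsigned index_size_bytes)
{
    switch (index_size_bytes) {
    case 1:  return 0xffu;
    case 2:  return 0xffffu;
    default: return 0xffffffffu;
    }
}

/* Hypothetical predicate: is primitive restart performed for this draw?
 * With FIXED_INDEX enabled it only applies to indexed draws; otherwise
 * the plain PRIMITIVE_RESTART enable decides. */
static bool restart_active(bool restart, bool fixed_index, bool indexed_draw)
{
    if (fixed_index)
        return indexed_draw;
    return restart;
}
```

Under this reading, a DrawArrays call with both enables set performs no
restart at all, which matches the commit subject.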

Cc: 10.2 10.3 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

commit | commitdiff | tree

Eric Anholt [Tue, 6 Jan 2015 19:30:19 +0000 (11:30 -0800)]

vc4: Fix scaling W projection of the Z coordinate when there's a Z offset.

Fixes piglit glsl-fs-fragcoord-zw-perspective, es3conform
gl_FragCoord_z_frag, and the rest of the piglit glsl 1.10 interpolation
tests.

commit | commitdiff | tree

Eric Anholt [Tue, 6 Jan 2015 00:34:58 +0000 (16:34 -0800)]

vc4: Fix deletion from the program cache.

The key is, oddly enough, in the key field, not in the data field (which
is the vc4_compiled_shader *). Fixes regular failures in fp-long-alu.

commit | commitdiff | tree

Eric Anholt [Sat, 3 Jan 2015 06:55:37 +0000 (22:55 -0800)]

vc4: Skip storing the Z/S contents when it's invalidated.

Improves framerate of 5 seconds of es2gears by 1.57473% +/- 0.669409%
(n=67).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>

commit | commitdiff | tree

Eric Anholt [Sun, 21 Dec 2014 20:48:59 +0000 (12:48 -0800)]

gallium: Plumb the swap INVALIDATE_ANCILLARY flag through more layers.

v2: Instead of telling the driver that the window system ancillaries have
    been invalidated (when the driver doesn't know which of its buffers
    are the window system's!), introduce a method for invalidating
    specific surfaces.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>

commit | commitdiff | tree

Eric Anholt [Sun, 21 Dec 2014 19:51:33 +0000 (11:51 -0800)]

egl: Inform the client API when ancillary buffers may become undefined.

This is part of the EGL spec, and is useful for a tiled renderer to avoid
the memory bandwidth cost of storing the depth/stencil buffers.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>

commit | commitdiff | tree

Vinson Lee [Mon, 5 Jan 2015 22:53:03 +0000 (14:53 -0800)]

ax_prog_flex.m4: Merge upstream OpenBSD fixes.

Merge the following upstream autoconf-archive patches.

ax_prog_flex: change grep syntax to accept e.g. "flex.real" in case a wrapper or symlink is used.
AX_PROG_FLEX: avoid use of grep empty string escape extension (fix for OpenBSD)
AX_PROG_FLEX: Also accept gflex.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jonathan Gray <jsg@openbsd.org>

commit | commitdiff | tree

Tom Stellard [Tue, 23 Dec 2014 15:26:23 +0000 (10:26 -0500)]

radeon/llvm: Use amdgcn triple for SI+ on LLVM >= 3.6

commit | commitdiff | tree

Tom Stellard [Wed, 15 Oct 2014 16:24:30 +0000 (12:24 -0400)]

radeonsi: Cache LLVMTargetMachine object in si_screen

Rather than building a new one every compile. This should reduce some
of the overhead of compiling shaders.

One consequence of this change is that we lose the MachineInstrs dumps
when dumping the shaders via R600_DEBUG. The LLVM IR and assembly are
still dumped, and if you still want to see the MachineInstr dump, you
can run the dumped LLVM IR through llc.

commit | commitdiff | tree

Brian Paul [Fri, 2 Jan 2015 23:56:12 +0000 (16:56 -0700)]

mesa: create, use new _mesa_texture_base_format() function

Reviewed-by: Eric Anholt <eric@anholt.net>

commit | commitdiff | tree

Brian Paul [Fri, 2 Jan 2015 23:56:12 +0000 (16:56 -0700)]

mesa: remove unused ctx parameter for _mesa_select_tex_image()

Reviewed-by: Eric Anholt <eric@anholt.net>

commit | commitdiff | tree

Brian Paul [Fri, 2 Jan 2015 23:56:12 +0000 (16:56 -0700)]

swrast: use new _mesa_base_tex_image() helper

Reviewed-by: Eric Anholt <eric@anholt.net>

commit | commitdiff | tree

Brian Paul [Fri, 2 Jan 2015 23:56:12 +0000 (16:56 -0700)]

st/mesa: use new _mesa_base_tex_image() helper

This involved adding a new st_texture_image_const() helper also.

Reviewed-by: Eric Anholt <eric@anholt.net>

commit | commitdiff | tree

Brian Paul [Fri, 2 Jan 2015 23:56:12 +0000 (16:56 -0700)]

mesa: add _mesa_base_tex_image() helper function

Reviewed-by: Eric Anholt <eric@anholt.net>

commit | commitdiff | tree

Brian Paul [Fri, 2 Jan 2015 23:56:12 +0000 (16:56 -0700)]

mesa: simplify a conditional in detach_shader()

Reviewed-by: Eric Anholt <eric@anholt.net>

commit | commitdiff | tree

Brian Paul [Fri, 2 Jan 2015 23:56:12 +0000 (16:56 -0700)]

mesa: minor whitespace fixes in shaderapi.c

Reviewed-by: Eric Anholt <eric@anholt.net>

commit | commitdiff | tree

Brian Paul [Fri, 2 Jan 2015 23:56:12 +0000 (16:56 -0700)]

mesa: make _mesa_reference_shader_program() an inline function

which wraps _mesa_reference_shader_program_(), similar to what we do
for other reference-counted objects.

Reviewed-by: Eric Anholt <eric@anholt.net>

commit | commitdiff | tree

Brian Paul [Fri, 2 Jan 2015 23:56:12 +0000 (16:56 -0700)]

mesa: update comment on delete_shader_program()

Reviewed-by: Eric Anholt <eric@anholt.net>

commit | commitdiff | tree

Brian Paul [Fri, 2 Jan 2015 23:56:12 +0000 (16:56 -0700)]

mesa: rearrange error handling in glProgramParameteri()

Reviewed-by: Eric Anholt <eric@anholt.net>

commit | commitdiff | tree

Brian Paul [Fri, 2 Jan 2015 23:56:12 +0000 (16:56 -0700)]

mesa: fix error strings in shaderapi.c

The _mesa_-prefixed function names should not appear in GL error
messages.

Reviewed-by: Eric Anholt <eric@anholt.net>

commit | commitdiff | tree

Brian Paul [Fri, 2 Jan 2015 23:19:48 +0000 (16:19 -0700)]

glsl: use the is_gl_identifier() helper in a couple more places

Reviewed-by: Eric Anholt <eric@anholt.net>

commit | commitdiff | tree

Brian Paul [Fri, 19 Dec 2014 21:26:57 +0000 (14:26 -0700)]

meta: init var to silence uninitialized variable warning

commit | commitdiff | tree

Brian Paul [Fri, 19 Dec 2014 16:37:33 +0000 (09:37 -0700)]

draw: silence uninitialized variable warning

v2: move initialization of llvm_gs to declaration.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>

commit | commitdiff | tree

Brian Paul [Fri, 19 Dec 2014 16:36:51 +0000 (09:36 -0700)]

gallivm: silence a couple compiler warnings

Silence warnings about possibly uninitialized variables when making a
release build.

Reviewed-by: José Fonseca <jfonseca@vmware.com>

commit | commitdiff | tree

Leonid Shatz [Wed, 31 Dec 2014 18:07:44 +0000 (19:07 +0100)]

gallium/util: make sure cache line size is not zero

The "normal" detection (querying clflush size) already made sure it is
non-zero, however another method did not. This led to crashes if this
value happened to be zero (apparently can happen in virtualized environments
at least).
This fixes https://bugs.freedesktop.org/show_bug.cgi?id=87913

Cc: "10.4" <mesa-stable@lists.freedesktop.org>

commit | commitdiff | tree

Roland Scheidegger [Wed, 31 Dec 2014 16:39:57 +0000 (17:39 +0100)]

gallium/util: fix crash with daz detection on x86

The code used PIPE_ALIGN_VAR for the variable used by fxsave, however this
does not work if the stack isn't aligned. Hence use PIPE_ALIGN_STACK function
decoration to fix the segfault which can happen if stack alignment is only
4 bytes.
This fixes https://bugs.freedesktop.org/show_bug.cgi?id=87658.

Cc: "10.4" <mesa-stable@lists.freedesktop.org>

commit | commitdiff | tree

Ilia Mirkin [Mon, 5 Jan 2015 05:33:58 +0000 (00:33 -0500)]

nvc0: add name to magic number

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>

commit | commitdiff | tree

Ilia Mirkin [Wed, 31 Dec 2014 03:27:57 +0000 (22:27 -0500)]

nvc0: regenerate rnndb headers

The headers hadn't been regenerated in a long time and had seen a number
of manual modifications. A few changes:
- remove nvc0_2d entirely, use the nv50 header which has the nvc0
  values too
- remove 3ddefs, it's identical to the nv50 file
- move macros out into a separate file

Also the upstream rnndb changed the overall chip naming convention; this
was fixed up manually in the generated files until a better solution is
determined.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>

commit | commitdiff | tree

Ilia Mirkin [Wed, 31 Dec 2014 02:19:14 +0000 (21:19 -0500)]

nv50: regenerate rnndb headers

The headers hadn't been regenerated in a long time, and there were a few
minor divergences. Among other things, rnndb has changed naming to
G80/etc. For now I've not tackled switching that over, and have manually
replaced the nvidia codenames with the chip ids. However, no other
modifications were made to the headergen'd headers.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>

commit | commitdiff | tree

Tobias Klausmann [Sat, 3 Jan 2015 00:00:08 +0000 (01:00 +0100)]

nv50: enable texture compression

Compression seems to be supported for only some formats. Enable it for
those. Previously this was disabled for everything despite the code
looking like it was actually enabled.

Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>

commit | commitdiff | tree

Ilia Mirkin [Mon, 5 Jan 2015 00:32:18 +0000 (19:32 -0500)]

nv50/ir: enable sat modifier for OP_SUB

SUB is handled the same as ADD, so no reason not to allow a saturate
modifier on it.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>

commit | commitdiff | tree

Roy Spliet [Sun, 4 Jan 2015 23:22:17 +0000 (00:22 +0100)]

nv50/ir: Add sat modifier for mul

Signed-off-by: Roy Spliet <rspliet@eclipso.eu>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>

commit | commitdiff | tree

Ilia Mirkin [Mon, 5 Jan 2015 05:17:26 +0000 (00:17 -0500)]

nv50,nvc0: avoid doing work inside of an assert

assert is compiled out in release builds - don't put logic into it. Note
that this particular instance is only used for vp debugging and is
normally compiled out.
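
A minimal sketch of the pattern, using hypothetical helpers rather than
the nv50/nvc0 code. Work with side effects is done outside the assert,
and only the result is checked:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper with a side effect: consumes one slot. */
static bool take_slot(int *slots_left)
{
    if (*slots_left == 0)
        return false;
    (*slots_left)--;
    return true;
}

/* Fragile: with NDEBUG defined, the whole assert expression disappears,
 * so take_slot() is never called:
 *
 *     assert(take_slot(&slots_left));
 *
 * Robust: do the work unconditionally, then assert on the result. */
static void reserve_slot(int *slots_left)
{
    bool ok = take_slot(slots_left);
    assert(ok);
    (void)ok; /* avoids an unused-variable warning when NDEBUG is set */
}
```

With this shape, debug and release builds consume the same number of
slots; only the check itself is dropped in release builds.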

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>

commit | commitdiff | tree

Ilia Mirkin [Sun, 4 Jan 2015 23:03:20 +0000 (18:03 -0500)]

nv50/ir: fix texture offsets in release builds

asserts get compiled out in release builds, so they can't be relied
upon to perform logic.

Reported-by: Pierre Moreau <pierre.morrow@free.fr>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Roy Spliet <rspliet@eclipso.eu>
Cc: "10.2 10.3 10.4" <mesa-stable@lists.freedesktop.org>

commit | commitdiff | tree

Kenneth Graunke [Thu, 31 Jul 2014 08:26:30 +0000 (01:26 -0700)]

i965: Micro-optimize swizzle_to_scs() and make it inlinable.

brw_swizzle_to_scs has been showing up in my CPU profiling, which is
rather silly - it's a tiny amount of code.  It really should be inlined,
and can easily be implemented with fewer instructions.

The enum translation is as follows:

SWIZZLE_X, SWIZZLE_Y, SWIZZLE_Z, SWIZZLE_W, SWIZZLE_ZERO, SWIZZLE_ONE
        0          1          2          3             4            5
        4          5          6          7             0            1
  SCS_RED, SCS_GREEN,  SCS_BLUE, SCS_ALPHA,     SCS_ZERO,     SCS_ONE

which is simply (swizzle + 4) & 7.

Haswell needs extra textureGather workarounds to remap GREEN to BLUE,
but Broadwell and later do not.

This patch replicates swizzle_to_scs in gen7_wm_surface_state.c and
gen8_surface_state.c, since the Gen8+ code can be simplified to a mere
two instructions.  Both copies can be marked static for easy inlining.

v2: Put the commit message in the code as comments (requested by
    Jason Ekstrand).  Also fix a typo.
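
The translation table above can be checked with a small C sketch. The
enum names are simplified from the table and are not the driver's actual
definitions:

```c
/* Values follow the table in the commit message. */
enum { SWIZZLE_X, SWIZZLE_Y, SWIZZLE_Z, SWIZZLE_W, SWIZZLE_ZERO, SWIZZLE_ONE };
enum { SCS_ZERO, SCS_ONE, SCS_RED = 4, SCS_GREEN, SCS_BLUE, SCS_ALPHA };

/* Adding 4 moves X..W (0..3) to RED..ALPHA (4..7); the mask with 7 wraps
 * ZERO and ONE (4 and 5, giving 8 and 9) around to 0 and 1. */
static unsigned swizzle_to_scs(unsigned swizzle)
{
    return (swizzle + 4) & 7;
}
```

This covers only the plain mapping; the Haswell textureGather workaround
mentioned above is not modelled.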

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>

commit | commitdiff | tree

Kenneth Graunke [Thu, 1 Jan 2015 04:31:26 +0000 (20:31 -0800)]

i965: Support MESA_FORMAT_R8G8B8X8_SRGB.

Valve games use GL_SRGB8 textures.  Instead of supporting that properly,
we fell back to MESA_FORMAT_R8G8B8A8_SRGB (with an alpha channel), which
meant that we had to use texture swizzling to override the alpha to 1.0
when sampling.  This meant shader recompiles on Gen < 7.5 platforms.

By supporting MESA_FORMAT_R8G8B8X8_SRGB, the hardware just returns 1.0
for us, so we can just use SWIZZLE_XYZW, and avoid any recompiles.  All
generations of hardware have supported the format for sampling and
filtering; we can easily support rendering by using the R8G8B8A8_SRGB
format and writing garbage to the X channel.  (We do this already for
the non-SRGB version of this format.)

This removes all remaining shader recompiles in a time demo of "Counter
Strike: Global Offensive" (32 -> 0) on Sandybridge.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87886
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>

commit | commitdiff | tree

Kenneth Graunke [Thu, 1 Jan 2015 05:51:05 +0000 (21:51 -0800)]

i965: Fix BLORP sRGB MSAA overrides to cope with X vs. A formats.

The logic in brw_blorp_surface_info::set uses brw_format_for_mesa_format
for source surfaces, and brw->render_target_format[] for destination
surfaces. We should do the same in the sRGB MSAA overrides.

Currently, this isn't a problem, since SRGB MSAA buffers are all RGBA.
The next commit will introduce RGBX SRGB MSAA buffers, at which point
we need to get the RGBX -> RGBA format overrides for rendering right.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>

commit | commitdiff | tree

Kenneth Graunke [Thu, 1 Jan 2015 01:38:05 +0000 (17:38 -0800)]

i965: Copy shader->shadow_samplers to prog->ShadowSamplers.

ir_to_mesa does this - apparently we just forgot or something.

Without this, we'll guess the wrong texture swizzle (XYZW for color
instead of XXX1 for depth) when doing precompiles.

This cuts 26 shader recompiles in a time demo of "Counter Strike:
Global Offensive" (58 -> 32) on Sandybridge. Haswell still has 0
recompiles.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87886
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>

commit | commitdiff | tree

Kenneth Graunke [Thu, 1 Jan 2015 02:06:41 +0000 (18:06 -0800)]

i965: Make the precompile ignore DEPTH_TEXTURE_MODE on Gen7.5+.

Gen7.5+ platforms that support the "Shader Channel Select" feature leave
key->tex.swizzles[i] as SWIZZLE_NOOP except when GL_DEPTH_TEXTURE_MODE
is GL_ALPHA (which is really uncommon). So, the precompile should leave
them as SWIZZLE_NOOP (aka SWIZZLE_XYZW) as well.

We didn't notice this because prog->ShadowSamplers is not set correctly.
The next patch will fix that problem.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87886
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>

commit | commitdiff | tree

Kenneth Graunke [Wed, 12 Nov 2014 19:17:55 +0000 (11:17 -0800)]

i965: Implement WaCsStallAtEveryFourthPipecontrol on IVB/BYT.

According to the documentation, we need to do a CS stall on every fourth
PIPE_CONTROL command to avoid GPU hangs. The kernel does a CS stall
between batches, so we only need to count the PIPE_CONTROLs in our batches.

v2: Get the generation check right (caught by Chris Wilson),
combine the ++ with the check (suggested by Daniel Vetter).
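
The counting scheme, with the increment folded into the check, might look
roughly like this. The struct and field names are illustrative
assumptions, not necessarily those used in i965:

```c
#include <stdbool.h>

/* Hypothetical per-context counter; the field name is illustrative. */
struct batch_state {
    int pipe_controls_since_last_cs_stall;
};

/* Returns true when the PIPE_CONTROL about to be emitted must carry a
 * CS stall, i.e. on every fourth call. */
static bool needs_cs_stall(struct batch_state *s)
{
    if (++s->pipe_controls_since_last_cs_stall == 4) {
        s->pipe_controls_since_last_cs_stall = 0;
        return true;
    }
    return false;
}
```

A fuller version would presumably also reset the counter at batch
boundaries, since the kernel stalls between batches.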

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

commit | commitdiff | tree

Marek Olšák [Sun, 4 Jan 2015 22:53:23 +0000 (23:53 +0100)]

r300g: handle vertex format PIPE_FORMAT_NONE

commit | commitdiff | tree

Marek Olšák [Fri, 2 Jan 2015 13:13:43 +0000 (14:13 +0100)]

glsl_to_tgsi: fix a bug in copy propagation

This fixes the new piglit test: arb_uniform_buffer_object/2-buffers-bug

Cc: 10.2 10.3 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>

commit | commitdiff | tree

Kenneth Graunke [Wed, 3 Dec 2014 07:44:30 +0000 (23:44 -0800)]

i965: Make INTEL_DEBUG=state ignore state flags with a count of 1.

There are too many state flags to fit in one terminal screen, even with
a very tall terminal. Everything is flagged once, so a value of 1 means
that it hasn't ever happened again, and thus isn't terribly interesting.

Skipping those makes it easier to see the interesting values.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>

commit | commitdiff | tree

Kenneth Graunke [Thu, 1 Jan 2015 00:54:44 +0000 (16:54 -0800)]

i965: Fix INTEL_DEBUG=optimizer with VF types.

Hardcoding stderr is wrong; INTEL_DEBUG=optimizer uses other files.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>

commit | commitdiff | tree

Kenneth Graunke [Thu, 1 Jan 2015 00:47:25 +0000 (16:47 -0800)]

i965: Show opt_vector_float() and later passes in INTEL_DEBUG=optimizer.

In order to support calling opt_vector_float() inside a condition, this
patch makes OPT() a statement expression:

https://gcc.gnu.org/onlinedocs/gcc/Statement-Exprs.html

We've used that elsewhere already.
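
A minimal sketch of such a macro, assuming GCC or Clang (statement
expressions are a GNU extension). The pass functions and the global
progress flag are hypothetical stand-ins, not the i965 definitions:

```c
#include <stdbool.h>

static bool progress; /* accumulated result across all passes */

/* GNU statement expression: the value of the ({ ... }) block is the
 * value of its last expression, so OPT() can appear inside a condition. */
#define OPT(pass, ...) ({                         \
        bool this_progress = pass(__VA_ARGS__);   \
        progress = progress || this_progress;     \
        this_progress;                            \
    })

/* Hypothetical passes standing in for real optimizations. */
static bool pass_changes(int x) { return x > 0; }
static bool pass_noop(int x)    { (void)x; return false; }
```

A caller can then write `if (OPT(pass_changes, 1)) { ... }` to run
follow-up passes only when the first one made progress.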

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>

commit | commitdiff | tree

Jeremy Huddleston Sequoia [Fri, 2 Jan 2015 03:54:41 +0000 (19:54 -0800)]

swrast: Fix -Wduplicate-decl-specifier warning

swrast.c:67:12: warning: duplicate 'const' declaration specifier [-Wduplicate-decl-specifier]
const char const *swrast_vendor_string = "Mesa Project";
^
swrast.c:68:12: warning: duplicate 'const' declaration specifier [-Wduplicate-decl-specifier]
const char const *swrast_renderer_string = "Software Rasterizer";
^

Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>

commit | commitdiff | tree

Roy Spliet [Fri, 2 Jan 2015 02:28:50 +0000 (03:28 +0100)]

nv50/ir: Fold sat into mad

The mad instruction emitter already supported the saturate modifier,
but the ModifierFolding pass never tried folding cvt sat operations
in for NV50.

Signed-off-by: Roy Spliet <rspliet@eclipso.eu>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>

commit | commitdiff | tree

Ilia Mirkin [Thu, 1 Jan 2015 06:01:13 +0000 (01:01 -0500)]

nv50/ir: fold MAD when one of the multiplicands is const

Fold MAD dst, src0, immed, src2 (or src0/immed swapped) when
- immed = 0 -> MOV dst, src2
- immed = +/- 1 -> ADD dst, src0, src2

These types of MAD patterns were observed in some st/nine shaders.
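
The two folds can be sketched on a simplified, hypothetical instruction
struct (not the nv50 IR classes). Handling -1 by toggling a negate
modifier on src0 is an assumption about how the sign is carried over:

```c
#include <stdbool.h>

/* Hypothetical, simplified instruction; not the nv50 IR classes. */
enum op { OP_MAD, OP_MOV, OP_ADD };

struct insn {
    enum op op;
    int src0;       /* register index of the non-constant multiplicand */
    int src2;       /* register index of the addend */
    bool neg_src0;  /* source negate modifier on src0 */
};

/* Fold MAD dst, src0, immed, src2 for the special immediates.
 * Returns true if the instruction was rewritten. */
static bool fold_mad_imm(struct insn *i, float immed)
{
    if (i->op != OP_MAD)
        return false;
    if (immed == 0.0f) {
        i->op = OP_MOV;   /* src0 * 0 + src2 == src2 (for finite src0) */
        return true;
    }
    if (immed == 1.0f || immed == -1.0f) {
        i->op = OP_ADD;   /* src0 * (+/-1) + src2 */
        i->neg_src0 = (immed < 0.0f) != i->neg_src0;
        return true;
    }
    return false;
}
```

Any other immediate leaves the MAD untouched.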

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>

commit | commitdiff | tree

Alexander von Gluck IV [Mon, 29 Dec 2014 21:51:46 +0000 (21:51 +0000)]

gallium/state_tracker: Rewrite Haiku's state tracker

* More gallium-like
* Leverage stamps properly and don't call mesa functions

commit | commitdiff | tree

Marek Olšák [Wed, 31 Dec 2014 00:03:30 +0000 (01:03 +0100)]

radeonsi: fix warnings

commit | commitdiff | tree

Kenneth Graunke [Thu, 18 Dec 2014 12:45:40 +0000 (04:45 -0800)]

i965: Fix start/base_vertex_location for >1 prims but !BRW_NEW_VERTICES.

This is a partial revert of c89306983c07e5a88c0d636267e5ccf263cb4213.
It split the {start,base}_vertex_location handling into several steps:

1. Set brw->draw.start_vertex_location = prim[i].start
   and brw->draw.base_vertex_location = prim[i].basevertex.
   (This happened once per _mesa_prim, in the main drawing loop.)
2. Add brw->vb.start_vertex_bias and brw->ib.start_vertex_offset
   appropriately.  (This happened in brw_prepare_shader_draw_parameters,
   which was called just after brw_prepare_vertices, as part of state
   upload, and only happened when BRW_NEW_VERTICES was flagged.)
3. Use those values when emitting 3DPRIMITIVE (once per _mesa_prim).

If we drew multiple _mesa_prims, but didn't flag BRW_NEW_VERTICES on
the second (or later) primitives, we would do step #1, but not #2.
The first _mesa_prim would get correct values, but subsequent ones
would only get the first half of the summation.

The reason I originally did this was because I needed the value of
gl_BaseVertexARB to exist in a buffer object prior to uploading
3DSTATE_VERTEX_BUFFERS.  I believed I wanted to upload the value
of 3DPRIMITIVE's "Base Vertex Location" field, which was computed
as: (prims[i].indexed ? prims[i].start : prims[i].basevertex) +
brw->vb.start_vertex_bias.  The latter value wasn't available until
after brw_prepare_vertices, and the former weren't available in the
state upload code at all.  Hence the awkward split.

However, I believe that including brw->vb.start_vertex_bias was a
mistake.  It's an extra bias we apply when uploading vertex data into
VBOs, to move [min_index, max_index] to [0, max_index - min_index].

From the GL_ARB_shader_draw_parameters specification:
"<gl_BaseVertexARB> holds the integer value passed to the <baseVertex>
parameter to the command that resulted in the current shader
invocation.  In the case where the command has no <baseVertex>
parameter, the value of <gl_BaseVertexARB> is zero."

I conclude that gl_BaseVertexARB should only include the baseVertex
parameter from glDraw*Elements*, not any internal biases we add for
optimization purposes.

With that in mind, gl_BaseVertexARB only needs prim[i].start or
prim[i].basevertex.  We can simply store that, and go back to computing
start_vertex_location and base_vertex_location in brw_emit_prim(), like
we used to.  This is much simpler, and should actually fix two bugs.

Fixes missing geometry in Unvanquished.

Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85529
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
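
A rough sketch of the single-step computation described above: both locations are derived once per primitive, in one place, so the result cannot depend on whether BRW_NEW_VERTICES happened to be flagged. The structs and the function are hypothetical stand-ins for _mesa_prim and the brw context, and which bias is added to which field is an assumption of this sketch rather than something the commit message spells out.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the _mesa_prim fields used here. */
struct prim {
   bool indexed;
   int start;
   int basevertex;
};

/* Hypothetical stand-in for the two context-wide biases. */
struct draw_biases {
   int vb_start_vertex_bias;    /* brw->vb.start_vertex_bias */
   int ib_start_vertex_offset;  /* brw->ib.start_vertex_offset */
};

struct prim_locations {
   int start_vertex_location;
   int base_vertex_location;
};

/* Steps 1 and 2 merged: seed from the primitive, then add the biases,
 * every time a primitive is emitted. */
static struct prim_locations
emit_prim_locations(const struct prim *p, const struct draw_biases *b)
{
   struct prim_locations loc = { p->start, p->basevertex };

   if (p->indexed) {
      loc.start_vertex_location += b->ib_start_vertex_offset;
      loc.base_vertex_location += b->vb_start_vertex_bias;
   } else {
      loc.start_vertex_location += b->vb_start_vertex_bias;
   }
   return loc;
}
```

With this shape, the second and later primitives of a draw receive the same full summation as the first one.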

commit | commitdiff | tree

Kenneth Graunke [Tue, 30 Dec 2014 20:21:03 +0000 (12:21 -0800)]

i965: Use WARN_ONCE for the single-primitive-exceeded-aperture message.

This makes it show up via ARB_debug_output and is also less code.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
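
The general warn-once pattern can be sketched as follows. This is not the i965 WARN_ONCE definition: EXAMPLE_WARN_ONCE, warnings_emitted and submit_primitive are hypothetical, and the sketch only prints to stderr, whereas the commit says the real macro also reports through ARB_debug_output.

```c
#include <stdbool.h>
#include <stdio.h>

static int warnings_emitted;  /* counter used only by this sketch */

/* Each expansion site owns one static flag, so the message is reported
 * the first time cond holds at that site and never again. */
#define EXAMPLE_WARN_ONCE(cond, msg) do {        \
      static bool warned = false;                \
      if ((cond) && !warned) {                   \
         fprintf(stderr, "WARNING: %s\n", msg);  \
         warnings_emitted++;                     \
         warned = true;                          \
      }                                          \
   } while (0)

static void submit_primitive(bool exceeds_aperture)
{
   EXAMPLE_WARN_ONCE(exceeds_aperture,
                     "single primitive exceeded aperture");
}
```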

commit | commitdiff | tree

Eric Anholt [Tue, 30 Dec 2014 23:39:20 +0000 (15:39 -0800)]

u_primconvert: Fix leak of the upload BO on context destroy.

v2: Conditionalize it on having done any uploads (Turns out
u_upload_destroy() isn't safe with a NULL arg).

Reviewed-by: Dave Airlie <airlied@redhat.com> (v1)
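
The v2 change amounts to guarding a destroy call that cannot take NULL. A minimal sketch with hypothetical types (struct uploader, struct primconvert) standing in for the real u_upload_mgr and primconvert context:

```c
#include <stdlib.h>

static int uploader_destroy_calls;  /* counter used only by this sketch */

struct uploader {
   void *buffer;
};

/* Stand-in destroy function that, like the one described above,
 * dereferences its argument and so must not be passed NULL. */
static void uploader_destroy(struct uploader *u)
{
   uploader_destroy_calls++;
   free(u->buffer);
   free(u);
}

struct primconvert {
   struct uploader *upload;  /* created lazily, on the first upload */
};

static void primconvert_destroy(struct primconvert *pc)
{
   /* Only tear down the uploader if an upload ever created it. */
   if (pc->upload)
      uploader_destroy(pc->upload);
   free(pc);
}
```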

commit | commitdiff | tree

Eric Anholt [Wed, 31 Dec 2014 00:10:28 +0000 (16:10 -0800)]

vc4: Fix memory leak as of 0404e7fe0ac2a6234a11290b4b1596e8bc127a4b.

Can't reset the CL before looking at how much we had put in it.

commit | commitdiff | tree

Ilia Mirkin [Wed, 31 Dec 2014 04:19:47 +0000 (23:19 -0500)]

nv50,nvc0: set vertex id base to index_bias

Fixes the piglits which check that gl_VertexID includes the base vertex
offset:
arb_draw_indirect-vertexid elements
gl-3.2-basevertex-vertexid

Note that this leaves out the original G80, for which this will continue
to fail. It could be fixed by passing a driver constbuf value in, but
that's beyond the scope of this change.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>

commit | commitdiff | tree

Tiziano Bacocco [Tue, 30 Dec 2014 20:33:48 +0000 (21:33 +0100)]

nv50,nvc0: implement half_pixel_center

LAST_LINE_PIXEL has actually been renamed to PIXEL_CENTER_INTEGER in
rnndb; use that method to implement the rasterizer setting, used for
st/nine.

Signed-off-by: Tiziano Bacocco <tizbac2@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>

commit | commitdiff | tree

Eric Anholt [Sun, 28 Dec 2014 18:14:19 +0000 (08:14 -1000)]

vc4: Only render tiles where the scissor ever intersected them.

This gives a 2.7x improvement in x11perf -rect100, since we only end up
load/storing the x11perf window, not the whole screen.
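
One way to picture the idea is to accumulate a bounding box of every scissor rectangle used during a frame and convert it to tile coordinates at submit time. The sketch below assumes a 64-pixel tile edge and uses hypothetical names throughout; the real driver's tile size and bookkeeping may differ.

```c
#include <limits.h>

#define TILE_SIZE 64  /* assumed tile edge in pixels, for illustration */

/* Hypothetical bounds of everything drawn so far, in pixels
 * (max values are exclusive). */
struct draw_bounds {
   int min_x, min_y, max_x, max_y;
};

static void bounds_reset(struct draw_bounds *b)
{
   b->min_x = b->min_y = INT_MAX;
   b->max_x = b->max_y = 0;
}

/* Grow the bounds to cover the scissor rectangle of one draw call. */
static void bounds_add_scissor(struct draw_bounds *b,
                               int x0, int y0, int x1, int y1)
{
   if (x0 < b->min_x) b->min_x = x0;
   if (y0 < b->min_y) b->min_y = y0;
   if (x1 > b->max_x) b->max_x = x1;
   if (y1 > b->max_y) b->max_y = y1;
}

/* Number of tiles that must be loaded and stored for these bounds. */
static int bounds_tile_count(const struct draw_bounds *b)
{
   if (b->max_x <= b->min_x || b->max_y <= b->min_y)
      return 0;  /* nothing was drawn */

   int tx0 = b->min_x / TILE_SIZE;
   int ty0 = b->min_y / TILE_SIZE;
   int tx1 = (b->max_x + TILE_SIZE - 1) / TILE_SIZE;  /* round up */
   int ty1 = (b->max_y + TILE_SIZE - 1) / TILE_SIZE;
   return (tx1 - tx0) * (ty1 - ty0);
}
```

A 100x100 window in the middle of a large screen then touches only a handful of tiles instead of every tile of the framebuffer.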

commit | commitdiff | tree

Eric Anholt [Tue, 30 Dec 2014 20:12:15 +0000 (12:12 -0800)]

vc4: Move draw call reset handling to a helper function.

This will be more important in the next commit, when there's more state to
reset to nonzero values, and I want an early exit from the submit
function.

commit | commitdiff | tree

Eric Anholt [Fri, 26 Dec 2014 02:24:15 +0000 (16:24 -1000)]

vc4: Drop the content of vc4_flush_resource().

The callers all follow it with a flush of the context, and the flush of
the context gives us more information about how things are being flushed.