git.libre-soc.org Git - mesa.git/log


commit | commitdiff | tree

Kenneth Graunke [Tue, 30 Sep 2014 08:15:56 +0000 (01:15 -0700)]

i965: Use BDW_MOCS_PTE for renderbuffers.

Write-back caching cannot be used for buffers being scanned out by the
display engine; surfaces used for scan-out must be write-through or
uncached.  I originally chose WT for render targets because it works in
all cases.  However, we really want to use write-back caching where
possible, as it is more efficient.

Most renderbuffers are not used for scanout - off-screen FBOs certainly
are fine, and non-pageflipped backbuffers should be fine as well.  So
in most cases WB will work.  However, we don't know what will be used
for scan-out, so we instead simply use the PTE value specified by the
kernel, as it knows these things.

This matches our MOCS choice on Haswell.

Fixes performance regressions since commit ee4484be3dc827cf15bcf109f5
in a microbenchmark (spotted by Eero Tamminen).  Improves performance
in GLBenchmark 2.7/EgyptHD by 7.44362% +/- 0.496939% (n=55) on a
Broadwell GT2.  Improves performance in a bunch of other microbenchmarks
by ~15% or so.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reported-by: Eero Tamminen <eero.t.tamminen@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Cc: mesa-stable@lists.freedesktop.org

commit | commitdiff | tree

Kenneth Graunke [Tue, 30 Sep 2014 08:15:55 +0000 (01:15 -0700)]

i965: Add a BDW_MOCS_PTE #define.

Like BDW_MOCS_WB and BDW_MOCS_WT, this specifies that we want to use all
three caches (L3, LLC, and eLLC where available), but leaves the LLC
caching mode up to the kernel's page table entry.

This allows the kernel to pick WB/WT/UC based on whether it's using a
buffer for scanout.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Cc: mesa-stable@lists.freedesktop.org

commit | commitdiff | tree

Kenneth Graunke [Sat, 27 Sep 2014 05:02:50 +0000 (22:02 -0700)]

mesa: Make _mesa_print_arrays use stderr.

These days, most driver debug output happens via stderr, not stdout.
Some applications (such as Xephyr) also appear to close stdout which
makes these messages go nowhere.
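
As a minimal sketch of the idea (the helper name debug_printf_stderr is
hypothetical and is not the function touched by this commit), debug output
can be routed through stderr so it survives an application closing stdout:

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical debug helper, not Mesa's actual function: formats a
 * message and writes it to stderr, which stays usable even when the
 * application has closed stdout.  Returns the number of characters
 * written, or a negative value on error. */
static int debug_printf_stderr(const char *fmt, ...)
{
   va_list ap;
   int n;

   va_start(ap, fmt);
   n = vfprintf(stderr, fmt, ap);
   va_end(ap);
   return n;
}
```

A call such as debug_printf_stderr("array %d enabled\n", 3) would then show
up alongside the rest of the driver's debug output.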

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>

commit | commitdiff | tree

Michel Dänzer [Tue, 26 Aug 2014 09:21:50 +0000 (18:21 +0900)]

r600g,radeonsi: Always use GTT again for PIPE_USAGE_STREAM buffers

Putting those in VRAM can cause long pauses due to buffers being moved
into / out of VRAM.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84662
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

commit | commitdiff | tree

Eric Anholt [Thu, 9 Oct 2014 07:36:03 +0000 (09:36 +0200)]

vc4: Optimize SF(ITOF(x)) -> SF(x).

This is a common production of st_glsl_to_tgsi, because CMP takes a float
argument.
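
A rough model of why the rewrite is safe, assuming SF only derives zero and
negative flags from its operand (the struct and function names below are
illustrative, not vc4 code): converting an integer to float never changes
whether it is zero or negative, so the flags come out the same either way.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative model of condition flags; the names are hypothetical. */
struct flags {
   bool zero;
   bool negative;
};

/* Flags computed directly from the integer value: SF(x). */
static struct flags flags_from_int(int32_t x)
{
   struct flags f = { x == 0, x < 0 };
   return f;
}

/* Flags computed after an int-to-float conversion: SF(ITOF(x)). */
static struct flags flags_from_itof(int32_t x)
{
   float v = (float)x;
   struct flags f = { v == 0.0f, v < 0.0f };
   return f;
}

static bool flags_equal(struct flags a, struct flags b)
{
   return a.zero == b.zero && a.negative == b.negative;
}
```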

commit | commitdiff | tree

Eric Anholt [Thu, 9 Oct 2014 07:32:10 +0000 (09:32 +0200)]

vc4: Add some optimization of FADD(FSUB(0, x)).

This is a common production of st_glsl_to_tgsi, which uses negate flags on
source arguments to handle subtraction.
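
One plausible form of this rewrite, shown here only as an assumption about
what the optimization targets, is folding FADD(a, FSUB(0, x)) into
FSUB(a, x). The sketch below uses hypothetical function names and simply
shows that both forms compute the same value:

```c
/* What a negated source expands to: a + (0 - x). */
static float fadd_of_fsub(float a, float x)
{
   float negated = 0.0f - x;   /* FSUB(0, x) */
   return a + negated;         /* FADD(a, ...) */
}

/* The folded form: a single subtraction. */
static float fsub_direct(float a, float x)
{
   return a - x;               /* FSUB(a, x) */
}
```

The two can differ in the sign of a zero result (for example a = -0.0 and
x = +0.0), which compares equal but is worth keeping in mind.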

commit | commitdiff | tree

Eric Anholt [Mon, 6 Oct 2014 22:47:38 +0000 (15:47 -0700)]

vc4: Mostly fix offset calculation for NPOT mipmap levels.

The non-base NPOT levels are stored as POT-aligned images. We get that
POT alignment by minifying the POT-aligned base level.

This means that level strides are also POT aligned, so we have to tell the
rendering mode config that our resource is larger than the actual
requested area.
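
The size calculation described above can be sketched roughly as follows.
The helper names are hypothetical and this is not the driver's code; it
only illustrates "minify the POT-aligned base level" for one dimension:

```c
#include <stdint.h>

/* Smallest power of two >= v.  Assumes v <= 2^31 so the loop terminates. */
static uint32_t next_power_of_two(uint32_t v)
{
   uint32_t p = 1;
   while (p < v)
      p <<= 1;
   return p;
}

/* Halve a size once per level, never going below 1. */
static uint32_t minify(uint32_t size, uint32_t level)
{
   uint32_t s = level < 32 ? size >> level : 0;
   return s ? s : 1;
}

/* Stored size of one dimension at a given mip level: the base level keeps
 * its requested size, every other level is the minified POT-aligned base. */
static uint32_t pot_level_size(uint32_t base_size, uint32_t level)
{
   if (level == 0)
      return base_size;
   return minify(next_power_of_two(base_size), level);
}
```

For a 100-texel-wide base level, level 1 is stored 64 texels wide even
though only 50 texels were requested, which is why the stride exceeds the
requested area.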

Fixes the fbo-generatemipmap-formats NPOT cases. Regresses
depthstencil-render-miplevels 273 * -- the texture presentation now works
(where it was completely broken before), it looks like there's some
overflow of image bounds happening at the lower miplevels.

commit | commitdiff | tree

Eric Anholt [Tue, 30 Sep 2014 23:25:48 +0000 (16:25 -0700)]

vc4: Move the mirrored kernel code to a kernel/ directory.

Now this whole setup matches the kernel's file layout much more closely.

commit | commitdiff | tree

Eric Anholt [Tue, 30 Sep 2014 20:27:36 +0000 (13:27 -0700)]

vc4: Enable LIT lowering in TGSI instead of our own code.

This brings us the -128/128 clamping on the w component.

commit | commitdiff | tree

Eric Anholt [Wed, 8 Oct 2014 20:26:58 +0000 (22:26 +0200)]

vc4: Fix scalar math opcodes to replicate their result from the X channel.

Thanks to robclark for pointing out that I was probably failing to do this
when I reported a "bug" in his lowering code.
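
For illustration, TGSI scalar opcodes such as RCP read only the X channel
of their source and replicate the result to every destination channel. A
minimal model (the struct and function names are made up for this sketch):

```c
struct vec4 {
   float x, y, z, w;
};

/* Model of a replicating scalar opcode: the reciprocal is computed from
 * src.x only, and the same value is written to all four channels. */
static struct vec4 scalar_rcp(struct vec4 src)
{
   float r = 1.0f / src.x;
   struct vec4 dst = { r, r, r, r };
   return dst;
}
```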

commit | commitdiff | tree

Chia-I Wu [Wed, 8 Oct 2014 19:30:17 +0000 (03:30 +0800)]

ilo: fix rectlist on GEN7+

It was broken by 343b014b57ecc5431477e090100e6a26edbda540.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>

commit | commitdiff | tree

Eric Anholt [Tue, 30 Sep 2014 23:24:26 +0000 (16:24 -0700)]

vc4: Add support for two-sided color.

It's fairly easy, thanks to Rob Clark's lowering code. Fixes
two-sided-lighting and 4 vertex-program-two-side testcases, while
regressing 8 testcases that involve enabling two-sided color while only
initializing one of the two colors in the VS. If you're enabling two
sided color, it's of course expected that you really do set up both
colors, so this is still an improvement (and when we set up a linker for
TGSI, we'll hopefully fix those 8 fails).

commit | commitdiff | tree

Eric Anholt [Tue, 30 Sep 2014 20:29:22 +0000 (13:29 -0700)]

vc4: Enable POW lowering in TGSI instead of our own code.

commit | commitdiff | tree

Eric Anholt [Tue, 30 Sep 2014 20:26:06 +0000 (13:26 -0700)]

vc4: Enable DP lowering in TGSI instead of our own code.

commit | commitdiff | tree

Eric Anholt [Tue, 30 Sep 2014 20:25:24 +0000 (13:25 -0700)]

vc4: Start using tgsi_lowering for opcodes we haven't supported before.

commit | commitdiff | tree

Eric Anholt [Tue, 30 Sep 2014 20:14:02 +0000 (13:14 -0700)]

gallium: Rename freedreno parts of tgsi_lowering.[ch].

Acked-by: Rob Clark <robclark@freedesktop.org>

commit | commitdiff | tree

Eric Anholt [Tue, 30 Sep 2014 20:08:56 +0000 (13:08 -0700)]

gallium: Reformat tgsi_lowering.c for the normal style.

Acked-by: Rob Clark <robclark@freedesktop.org>

commit | commitdiff | tree

Eric Anholt [Tue, 30 Sep 2014 20:07:23 +0000 (13:07 -0700)]

gallium: Copy fd_lowering.[ch] to tgsi_lowering.[ch] for code sharing.

Lots of drivers need to transform the weird instructions in TGSI into
reasonable scalar ops, and this code can make those translations
canonical.

Acked-by: Rob Clark <robclark@freedesktop.org>

commit | commitdiff | tree

Eric Anholt [Fri, 3 Oct 2014 06:32:59 +0000 (23:32 -0700)]

vc4: Set unused raddr fields to QPU_R_NOP.

The simulator assertion fails if you have a write to a reg and then a read
(for example, in the NOP side of an instruction), even if the read isn't
used for anything. By setting unused raddrs to NOP, we avoid the problem
(since only the physical registers are tracked).

commit | commitdiff | tree

Eric Anholt [Fri, 3 Oct 2014 06:22:03 +0000 (23:22 -0700)]

vc4: Abstract out the field-merging logic for instructions.

I'm going to be doing the same logic for some more fields next.

commit | commitdiff | tree

Niels Ole Salscheider [Mon, 8 Sep 2014 18:10:31 +0000 (20:10 +0200)]

r600: Use DMA transfers in r600_copy_global_buffer

v2: Do not demote items that are already in the pool

Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de>

commit | commitdiff | tree

Iago Toral Quiroga [Tue, 29 Jul 2014 09:36:31 +0000 (12:36 +0300)]

glsl: Optimize min/max expression trees

Original patch by Petri Latvala <petri.latvala@intel.com>:

Add an optimization pass that drops min/max expression operands that
can be proven to not contribute to the final result. The algorithm is
similar to alpha-beta pruning on a minimax search, from the field of
AI.

This optimization pass can optimize min/max expressions where operands
are min/max expressions. Such code can appear in shaders by itself, or
as the result of clamp() or AMD_shader_trinary_minmax functions.
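
To make the pruning idea concrete, here is a heavily simplified scalar
sketch. It is not the pass itself, the function names are hypothetical, and
NaN handling is ignored. In max(min(x, inner_c), outer_c), the inner min can
never exceed inner_c, so when inner_c <= outer_c the variable x cannot
contribute to the result:

```c
#include <stdbool.h>

static float fmin2(float a, float b) { return a < b ? a : b; }
static float fmax2(float a, float b) { return a > b ? a : b; }

/* The expression as written in the shader: max(min(x, inner_c), outer_c). */
static float eval_unoptimized(float x, float inner_c, float outer_c)
{
   return fmax2(fmin2(x, inner_c), outer_c);
}

/* True when min(x, inner_c) is bounded above by outer_c, so the outer max
 * always picks outer_c and the whole inner min can be dropped. */
static bool inner_min_is_dead(float inner_c, float outer_c)
{
   return inner_c <= outer_c;
}
```

With inner_c = 1.0 and outer_c = 3.0 the expression evaluates to 3.0 for
every x, so it can be replaced by the constant.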

This optimization pass improves the generated code for piglit's
AMD_shader_trinary_minmax tests as follows:

total instructions in shared programs: 75 -> 67 (-10.67%)
instructions in affected programs:     60 -> 52 (-13.33%)
GAINED:                                0
LOST:                                  0

All tests (max3, min3, mid3) improved.

A full shader-db run:

total instructions in shared programs: 4293603 -> 4293575 (-0.00%)
instructions in affected programs:     1188 -> 1160 (-2.36%)
GAINED:                                0
LOST:                                  0

Improvements happen in Guacamelee and Serious Sam 3. One shader from
Dungeon Defenders is hurt by shader-db metrics (26 -> 28), because of
dropping of a (constant float (0.00000)) operand, which was
compiled to a saturate modifier.

Version 2 by Iago Toral Quiroga <itoral@igalia.com>:

Changes from review feedback:
- Squashed various cosmetic changes sent by Matt Turner.
- Make less_all_components return an enum rather than setting a class member.
  (Suggested by Matt Turner). Also, renamed it to compare_components.
- Make less_all_components, smaller_constant and larger_constant static.
  (Suggested by Matt Turner)
- Change minmax_range to call its limits "low" and "high" instead of
  "range[0]" and "range[1]". (Suggested by Connor Abbott).
- Use ir_builder swizzle helpers in swizzle_if_required(). (Suggested by
  Connor Abbott).
- Make the logic clearer by rearranging the code and commenting.
  (Suggested by Connor Abbott).
- Added comment to explain why we need to recurse twice. (Suggested by
  Connor Abbott).
- If we cannot prune an expression, do not return early. Instead, attempt
  to prune its children. (Suggested by Connor Abbott).

Other changes:
- Instead of having a global "valid" visitor member, let the various functions
  that can determine this status return a boolean and check for its value
  to decide what to do in each case. This is more flexible and allows recursing
  into children of parents that could not be pruned due to invalid
  ranges (so related to the last bullet in the review feedback).
- Make sure we always check if a range is valid before working with it. Since
  any use of get_range, combine_range or range_intersection can invalidate
  a range we should check for this situation every time we use any of these
  functions.

Version 3 by Iago Toral Quiroga <itoral@igalia.com>:

Changes from review feedback:
- Now we can make get_range, combine_range and range_intersection static too
  (suggested by Connor Abbott).
- Do not return NULL when looking for the smaller or larger constant in
  mixed vector constants. Instead, produce a new constant by doing a
  component-wise minmax. With this we can also remove the validations when
  we call into these functions (suggested by Connor Abbott).
- Add a comment explaining the meaning of the baserange argument in
  prune_expression (suggested by Connor Abbott).

Other changes:
- Eliminate minmax expressions operating on constant vectors with mixed values
  by resolving them.

No piglit regressions observed with Version 3.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76861

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>

commit | commitdiff | tree

Tapani Pälli [Tue, 16 Sep 2014 17:18:41 +0000 (20:18 +0300)]

glsl: do not emit error for non written varyings on OpenGL ES

Patch fixes following test case from 'shaders-with-varyings' WebGL
conformance suite: "vertex shader with unused varying and fragment
shader with used varying must succeed"

v2: still emit a warning if the condition happens (Ian)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

commit | commitdiff | tree

Michel Dänzer [Mon, 6 Oct 2014 08:05:38 +0000 (17:05 +0900)]

radeonsi: Use dummy pixel shader if compilation of the real shader failed

Instead of crashing.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79155#c5
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

commit | commitdiff | tree

Chia-I Wu [Mon, 6 Oct 2014 04:42:56 +0000 (12:42 +0800)]

ilo: let shaders determine surface counts

When a shader needs N surfaces, we should upload N surfaces and not depend on
how many are bound. This commit is larger than it should be because we did
not export how many surfaces a shader uses before.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>

commit | commitdiff | tree

Chia-I Wu [Sat, 4 Oct 2014 02:51:20 +0000 (10:51 +0800)]

ilo: let shaders determine sampler counts

When a shader needs N samplers, we should upload N samplers and not depend on
how many are bound.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>

commit | commitdiff | tree

Marek Olšák [Thu, 2 Oct 2014 14:36:51 +0000 (16:36 +0200)]

tgsi: change tgsi_shader_info::properties to a one-dimensional array

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
v2: fix svga too

commit | commitdiff | tree

Marek Olšák [Tue, 23 Sep 2014 17:42:28 +0000 (19:42 +0200)]

radeonsi: set number of userdata SGPRs of GS copy shader to 4

It only needs the constant buffer with clip planes and read-write resources
for the GS->VS ring and streamout. That's 2 pointers.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

commit | commitdiff | tree

Marek Olšák [Tue, 30 Sep 2014 16:15:17 +0000 (18:15 +0200)]

radeonsi: pass the GS shader directly to si_generate_gs_copy_shader

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

commit | commitdiff | tree

Marek Olšák [Tue, 30 Sep 2014 16:13:06 +0000 (18:13 +0200)]

radeonsi: set LLVMByValAttribute for all descriptor arrays

I hope this is correct.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

commit | commitdiff | tree

Marek Olšák [Thu, 25 Sep 2014 14:47:55 +0000 (16:47 +0200)]

radeonsi: make the vertex shader key smaller

We only support 16 vertex attribs, not 32.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

commit | commitdiff | tree

Marek Olšák [Tue, 23 Sep 2014 15:25:41 +0000 (17:25 +0200)]

radeonsi: don't flush shader caches when building PM4 shader states

This is a wrong place to flush caches to say the least.

I don't think we need to flush the instruction caches if we don't patch
shaders with DMA.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

commit | commitdiff | tree

Marek Olšák [Tue, 30 Sep 2014 15:09:13 +0000 (17:09 +0200)]

radeonsi: remove interp_at_sample from the key, use TGSI_INTERPOLATE_LOC_SAMPLE

st/mesa has the same flag in its shader key, we don't need to do it
in the driver anymore.

Instead, use TGSI_INTERPOLATE_LOC_SAMPLE, which is what st/mesa sets.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

commit | commitdiff | tree

Marek Olšák [Tue, 30 Sep 2014 14:55:36 +0000 (16:55 +0200)]

radeonsi: move geometry shader properties from si_shader to si_shader_selector

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

commit | commitdiff | tree

Marek Olšák [Tue, 30 Sep 2014 14:25:18 +0000 (16:25 +0200)]

radeonsi: always compile shaders on demand

The first compiled shader is sometimes useless, because the key doesn't match
the key for the draw call where it's used.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

commit | commitdiff | tree

Marek Olšák [Tue, 30 Sep 2014 14:11:59 +0000 (16:11 +0200)]

radeonsi: remove unused variable si_shader::gs_input_prim

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

commit | commitdiff | tree

Marek Olšák [Tue, 30 Sep 2014 13:59:37 +0000 (15:59 +0200)]

tgsi: remove some not so useful variables from tgsi_shader_info

commit | commitdiff | tree

Marek Olšák [Tue, 30 Sep 2014 13:56:14 +0000 (15:56 +0200)]

radeonsi: get fs_write_all from tgsi_shader_info directly

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

commit | commitdiff | tree

Marek Olšák [Tue, 30 Sep 2014 13:48:22 +0000 (15:48 +0200)]

tgsi: simplify shader properties in tgsi_shader_info

Use an array of properties indexed by TGSI_PROPERTY_* definitions.
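
The layout can be sketched as below. All names are hypothetical stand-ins
for the real TGSI_PROPERTY_* enum and tgsi_shader_info struct, and the
one-value-per-property shape is an assumption made for illustration:

```c
/* Hypothetical property names standing in for TGSI_PROPERTY_*. */
enum example_property {
   EXAMPLE_PROPERTY_GS_INPUT_PRIM,
   EXAMPLE_PROPERTY_GS_MAX_OUTPUT_VERTICES,
   EXAMPLE_PROPERTY_FS_COLOR0_WRITES_ALL_CBUFS,
   EXAMPLE_PROPERTY_COUNT
};

/* One slot per property, indexed directly by the enum value, instead of
 * a separate named field for each property. */
struct example_shader_info {
   unsigned properties[EXAMPLE_PROPERTY_COUNT];
};

static void example_set_property(struct example_shader_info *info,
                                 enum example_property name, unsigned value)
{
   info->properties[name] = value;
}

static unsigned example_get_property(const struct example_shader_info *info,
                                     enum example_property name)
{
   return info->properties[name];
}
```

A driver would then read a value with a single indexed lookup rather than
needing a dedicated field to be added for every new property.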

commit | commitdiff | tree

Marek Olšák [Tue, 30 Sep 2014 13:12:09 +0000 (15:12 +0200)]

radeonsi: get tgsi_shader_info only once before compilation

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

commit | commitdiff | tree

Marek Olšák [Wed, 24 Sep 2014 16:26:21 +0000 (18:26 +0200)]

gallium/util: add util_bitcount64

I'll need this in radeonsi.

v2: use __builtin_popcountll if available
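
A sketch of such a helper, under the assumption that a compiler check
selects the builtin. The name example_bitcount64 and the portable fallback
are illustrative and may differ from what the commit actually does:

```c
#include <stdint.h>

/* Count the set bits in a 64-bit value.  Uses the compiler builtin when
 * building with GCC or Clang, otherwise clears the lowest set bit until
 * the value is zero. */
static unsigned example_bitcount64(uint64_t n)
{
#if defined(__GNUC__) || defined(__clang__)
   return (unsigned)__builtin_popcountll(n);
#else
   unsigned count = 0;
   while (n) {
      n &= n - 1;   /* clear the lowest set bit */
      count++;
   }
   return count;
#endif
}
```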

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

commit | commitdiff | tree

Marek Olšák [Fri, 5 Sep 2014 09:59:10 +0000 (11:59 +0200)]

radeonsi: fix CS tracing and remove excessive CS dumping

commit | commitdiff | tree

Ilia Mirkin [Sun, 28 Sep 2014 16:07:03 +0000 (12:07 -0400)]

gk110/ir: add dnz flag emission for fmul/fmad

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>

commit | commitdiff | tree

Ilia Mirkin [Sun, 28 Sep 2014 05:52:11 +0000 (01:52 -0400)]

gm107/ir: add dnz emission for fmul

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.3" <mesa-stable@lists.freedesktop.org>

commit | commitdiff | tree

Brian Paul [Fri, 3 Oct 2014 15:55:34 +0000 (09:55 -0600)]

st/wgl: add WINAPI qualifiers on wgl function typedefs

Fixes a release build segfault when wglCreateContextAttribsARB()
calls the wglCreateContext() function.

Cc: "10.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matthew McClure <mcclurem@vmware.com>

commit | commitdiff | tree

Rob Clark [Fri, 3 Oct 2014 16:48:31 +0000 (12:48 -0400)]

freedreno: query fixes

Fixes a few issues, including a potential empty-IB (which triggers gpu
hangs in piglit occlusion_query_meta_no_fragments)

Signed-off-by: Rob Clark <robclark@freedesktop.org>

commit | commitdiff | tree

Rob Clark [Fri, 3 Oct 2014 14:08:59 +0000 (10:08 -0400)]

freedreno/a3xx: handle VS only outputting BCOLOR

Possibly we should map the front color to black (zeroes). But not sure
there is a way to do that without generating a shader variant.

Signed-off-by: Rob Clark <robclark@freedesktop.org>

commit | commitdiff | tree

Rob Clark [Fri, 3 Oct 2014 14:02:31 +0000 (10:02 -0400)]

freedreno/ir3: fix lockups with lame FRAG shaders

Shaders like:

  FRAG
  PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1
  DCL IN[0], GENERIC[0], PERSPECTIVE
  DCL OUT[0], COLOR
  DCL SAMP[0]
  DCL TEMP[0], LOCAL
  IMM[0] FLT32 {    0.0000,     1.0000,     0.0000,     0.0000}
    0: TEX TEMP[0], IN[0].xyyy, SAMP[0], 2D
    1: MOV OUT[0], IMM[0].xyxx
    2: END

cause unhappiness.  They have an IN[], but once this is compiled the
useless TEX instruction goes away, leaving a varying that is never
fetched, which makes the hw unhappy.

In the process fix a signed vs unsigned compare.  If the vertex shader
has max_reg=-1, MAX2() vs an unsigned would not give the desired result.
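
The signed vs unsigned pitfall can be reproduced in isolation. In this
sketch the function names are hypothetical, and the cast spells out the
conversion that C performs implicitly when an int is compared against an
unsigned value:

```c
#define MAX2(a, b) ((a) > (b) ? (a) : (b))

/* Buggy variant: the signed value is converted to unsigned before the
 * comparison, so -1 becomes the largest unsigned value and always wins. */
static unsigned max_reg_mixed(unsigned frag_max, int vert_max)
{
   return MAX2(frag_max, (unsigned)vert_max);
}

/* Fixed variant: both operands are signed, so -1 compares as expected. */
static int max_reg_signed(int frag_max, int vert_max)
{
   return MAX2(frag_max, vert_max);
}
```

With frag_max = 3 and vert_max = -1, the mixed version yields the maximum
unsigned value instead of 3.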

Signed-off-by: Rob Clark <robclark@freedesktop.org>

commit | commitdiff | tree

Matt Turner [Fri, 3 Oct 2014 17:01:54 +0000 (10:01 -0700)]

i965/compaction: Disable compaction on SNB temporarily.

Will investigate after XDC.

commit | commitdiff | tree

Matt Turner [Fri, 3 Oct 2014 16:58:41 +0000 (09:58 -0700)]

Revert "i965: Emit ELSE/ENDIF JIP with type D on Gen 7."

This reverts commit 54e30dbf4db437748509d1319c3f6e4185f76c69.

Will investigate after XDC.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84557

commit | commitdiff | tree

Matt Turner [Wed, 1 Oct 2014 06:18:34 +0000 (23:18 -0700)]

i965/fs: Remove dead generate_rep_fb_write prototype.

Added in commit f9dc7aab.

commit | commitdiff | tree

Brian Paul [Thu, 2 Oct 2014 15:36:54 +0000 (09:36 -0600)]

mesa: fix spurious wglGetProcAddress / GL_INVALID_OPERATION error

On Windows, the Piglit primitive-restart test was failing a
glGetError()==0 assertion when it was run w/out any command line
arguments. Piglit's all.py script only runs primitive-restart
with arguments so this case isn't normally hit during a full
piglit run.

The basic problem is Microsoft's opengl32.dll calls glFlush
from wglGetProcAddress() and Piglit uses wglGetProcAddress() to
resolve glPrimitiveRestartNV() which is called inside glBegin/End.
See comments in the code for more info.

Plus, improve the comments for _mesa_alloc_dispatch_table().

Cc: <mesa-stable@lists.freedesktop.org>
Acked-by: Sinclair Yeh <syeh@vmware.com>

commit | commitdiff | tree

Ilia Mirkin [Wed, 24 Sep 2014 21:42:03 +0000 (17:42 -0400)]

freedreno/ir3: add TXF support

Still failing a bunch of the fairly picky texelFetch tests, but the
1D(Array) ones are full passes.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>

commit | commitdiff | tree

Ilia Mirkin [Sat, 27 Sep 2014 14:50:40 +0000 (10:50 -0400)]

freedreno/ir3: add TXD support and expose ARB_shader_texture_lod

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>

commit | commitdiff | tree

Ilia Mirkin [Sat, 27 Sep 2014 06:52:42 +0000 (02:52 -0400)]

freedreno/ir3: add texture offset support

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>

commit | commitdiff | tree

Ilia Mirkin [Wed, 1 Oct 2014 00:02:37 +0000 (20:02 -0400)]

freedreno/ir3: shadow comes before array

Experimentally, this makes *ArrayShadow tex-miplevel-selection tests
pass.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>

commit | commitdiff | tree

Ilia Mirkin [Sun, 28 Sep 2014 23:37:27 +0000 (19:37 -0400)]

freedreno/ir3: make TXQ return integers, not floats

We're still doing something wrong for array textures.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>

commit | commitdiff | tree

Ilia Mirkin [Wed, 1 Oct 2014 05:13:38 +0000 (01:13 -0400)]

freedreno/ir3: add UMAD support

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>

commit | commitdiff | tree

Ilia Mirkin [Mon, 29 Sep 2014 02:00:34 +0000 (22:00 -0400)]

freedreno/ir3: add ISSG support

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>

commit | commitdiff | tree

Ilia Mirkin [Wed, 1 Oct 2014 05:03:31 +0000 (01:03 -0400)]

freedreno/ir3: add MOD support

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>

commit | commitdiff | tree

Ilia Mirkin [Mon, 29 Sep 2014 01:05:05 +0000 (21:05 -0400)]

freedreno/ir3: add UMOD support, based on UDIV

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>

commit | commitdiff | tree

Ilia Mirkin [Fri, 12 Sep 2014 03:15:11 +0000 (23:15 -0400)]

freedreno/ir3: add IDIV/UDIV support

Logic shamelessly copied from nv50 lowering pass.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>

commit | commitdiff | tree

Michel Dänzer [Thu, 2 Oct 2014 07:00:26 +0000 (16:00 +0900)]

radeonsi: Clear sampler view flags when binding a buffer

Fixes assertion failure while running the Unreal Engine 4 Elemental demo:

.../si_blit.c:322:si_decompress_color_textures: Assertion `tex->cmask.size || tex->fmask.size' failed.

Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

commit | commitdiff | tree

Eric Anholt [Thu, 2 Oct 2014 21:14:48 +0000 (14:14 -0700)]

vc4: Add support for framebuffer sRGB encoding.

commit | commitdiff | tree

Eric Anholt [Thu, 2 Oct 2014 21:01:29 +0000 (14:01 -0700)]

vc4: Add support for sampling from sRGB.

This isn't perfect -- the filtering is happening on the srgb values, and
we're decoding afterwards, which is not what you want. I think that's the
cause of some additional texwrap(GL_CLAMP, LINEAR) failures, though many
other texwrap tests on srgb start to pass since unfiltered values come out
correct.

commit | commitdiff | tree

Ilia Mirkin [Wed, 1 Oct 2014 03:27:25 +0000 (23:27 -0400)]

freedreno/ir3: avoid fan-in sources referring to same instruction

Since the RA has to be done s.t. each one gets its own (adjacent)
register, it would complicate matters if instructions were allowed to be
repeated. This enables copy-propagation use in situations where
previously that might have happened.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Rob Clark <robclark@freedesktop.org>

commit | commitdiff | tree

Rob Clark [Wed, 1 Oct 2014 15:28:17 +0000 (11:28 -0400)]

freedreno/a3xx: emit all immediates in one shot

Makes the command stream a bit tighter when there are lots of
immediates.

Signed-off-by: Rob Clark <robclark@freedesktop.org>

commit | commitdiff | tree

Ilia Mirkin [Thu, 2 Oct 2014 07:39:05 +0000 (03:39 -0400)]

freedreno: instanced drawing/compute not yet supported

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Rob Clark <robclark@freedesktop.org>

commit | commitdiff | tree

Dave Airlie [Tue, 30 Sep 2014 23:22:13 +0000 (09:22 +1000)]

mesa: fix GetTexImage for 1D array depth textures

While running piglit in virgl, I hit an assert in the Intel driver.

"qemu-system-x86_64: intel_tex.c:219: intel_map_texture_image: Assertion `tex_image->TexObject->Target != 0x8C18 || h == 1' failed."

Thanks to Eric and Ken for pointing me in the right direction.

Fix the get_tex_depth to do the same fixup as get_tex_rgba does
for 1D array textures.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Dave Airlie <airlied@redhat.com>

commit | commitdiff | tree

Tomasz Figa [Sat, 27 Sep 2014 14:20:01 +0000 (16:20 +0200)]

st/mesa: Fix paths used in Android builds

With current makefiles the build fails because source and build paths
are generated incorrectly. With Android build system the top_srcdir and
top_builddir variables are undefined and all paths are relative to where
Android.mk is located. This ends up with paths like
external/mesa/src/mesa/src/mesa/ for both source and build paths, which
are obviously wrong.

This patch fixes this by overriding resulting SRCDIR and BUILDDIR
variables with empty string, so that paths end up being relative to
Android.mk file again. Appending correct build path to generated files
is already done in Android.gen.mk.

Signed-off-by: Tomasz Figa <tomasz.figa@gmail.com>
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>

commit | commitdiff | tree

Tomasz Figa [Sat, 27 Sep 2014 14:20:00 +0000 (16:20 +0200)]

st/mesa: Generate format_info.c in Android builds

Current Android makefiles lack generation of format_info.c, which is
a dependency of main/format.c. This patch adds necessary code to
Android.gen.mk.

Signed-off-by: Tomasz Figa <tomasz.figa@gmail.com>
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>

commit | commitdiff | tree

Tomasz Figa [Sat, 27 Sep 2014 14:19:59 +0000 (16:19 +0200)]

util: Include in Android builds

This patch fixes Android build failures by including src/util directory
in compilation. Files inside of this directory are compiled into
libmesa_util static library and linked with resulting libGLES_mesa.

Signed-off-by: Tomasz Figa <tomasz.figa@gmail.com>
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>

commit | commitdiff | tree

Jason Ekstrand [Thu, 2 Oct 2014 23:04:57 +0000 (16:04 -0700)]

i965/fs: Use the correct base_mrf for spilling pairs in SIMD8

Before, we were hard-coding the base_mrf based on dispatch width, not the number
of registers spilled at a time. This caused us to emit instructions with a
base_mrf of 14 and an mlen of 3, so we used the magical non-existent m16
register. This fixes the problem.
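
The arithmetic behind the bug can be sketched as follows. The helper names
are hypothetical, and two details are assumptions made for illustration:
that there are 16 message registers (m0..m15) and that each spill message
carries one header register ahead of the data:

```c
#include <stdbool.h>

#define EXAMPLE_NUM_MRFS 16   /* assumed: m0..m15 */

/* Pick the base MRF from the number of registers spilled per message so
 * that the message ends exactly at the last MRF. */
static int example_spill_base_mrf(int regs_per_spill)
{
   int mlen = 1 + regs_per_spill;   /* assumed: one header register */
   return EXAMPLE_NUM_MRFS - mlen;
}

/* A message occupies registers base..base+mlen-1 and must stay in range. */
static bool example_mrf_range_valid(int base_mrf, int mlen)
{
   return base_mrf >= 0 && base_mrf + mlen <= EXAMPLE_NUM_MRFS;
}
```

Under these assumptions a base_mrf of 14 with an mlen of 3 covers m14, m15
and m16 and is rejected, while spilling a pair from base 13 fits.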

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>

commit | commitdiff | tree

Jason Ekstrand [Wed, 1 Oct 2014 17:54:59 +0000 (10:54 -0700)]

i965/fs: Add a MAX_GRF_SIZE define and use it in various places

Previously, we had a MAX_SAMPLER_MESSAGE_SIZE which we used instead.
However, some FB write messages can validly be longer than this so we need
something different. Since MAX_SAMPLER_MESSAGE_SIZE is validly useful on
its own, we leave it alone and add a new MAX_GRF_SIZE that's big enough for
FB writes.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84539
Reviewed-by: Matt Turner <mattst88@gmail.com>

commit | commitdiff | tree

Jason Ekstrand [Wed, 1 Oct 2014 17:46:48 +0000 (10:46 -0700)]

i965/fs: Use the actual register width in brw_reg_from_fs_reg

This fixes a bug where 1-wide operations don't properly translate down to
1-wide instructions.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>

commit | commitdiff | tree

Jason Ekstrand [Wed, 1 Oct 2014 17:27:24 +0000 (10:27 -0700)]

i965/fs_fp: Use null_reg from fs_visitor instead of rolling our own

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84529
Reviewed-by: Matt Turner <mattst88@gmail.com>

commit | commitdiff | tree

Rob Clark [Wed, 1 Oct 2014 19:26:26 +0000 (15:26 -0400)]

freedreno/a3xx: handle large shader program sizes

Above a certain limit use CACHE mode instead of BUFFER mode. This
should solve gpu hangs with large shader programs.

Signed-off-by: Rob Clark <robclark@freedesktop.org>

commit | commitdiff | tree

Rob Clark [Wed, 1 Oct 2014 18:57:34 +0000 (14:57 -0400)]

freedreno: update generated headers

Signed-off-by: Rob Clark <robclark@freedesktop.org>

commit | commitdiff | tree

Ilia Mirkin [Wed, 1 Oct 2014 04:26:03 +0000 (00:26 -0400)]

freedreno: dual-source render targets are not supported

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Rob Clark <robclark@freedesktop.org>

commit | commitdiff | tree

Ilia Mirkin [Wed, 1 Oct 2014 23:43:38 +0000 (19:43 -0400)]

gallium/hud: use u_sampler_view_default_template helper

The existing code was not setting several fields, most importantly the
target, which is required on nv50/nvc0.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

commit | commitdiff | tree

Iago Toral Quiroga [Wed, 1 Oct 2014 10:12:38 +0000 (12:12 +0200)]

glsl: Fix memory leak in builtin_builder::_image_prototype.

in_var calls the ir_variable constructor, which dups the variable name.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>

commit | commitdiff | tree

Tapani Pälli [Tue, 30 Sep 2014 07:28:26 +0000 (10:28 +0300)]

mesa: relax draw api validation on ES2

Patch fixes failing test in WebGL conformance test
'point-no-attributes' when running Chrome on OpenGL ES.
(Shader program may draw points using constant data in shader.)

No Piglit regressions.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

commit | commitdiff | tree

Ilia Mirkin [Tue, 30 Sep 2014 04:12:40 +0000 (00:12 -0400)]

glsl: make consistent use of DECLARE_RALLOC_CXX_OPERATORS

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

commit | commitdiff | tree

Eric Anholt [Wed, 1 Oct 2014 18:58:22 +0000 (11:58 -0700)]

vc4: Fix the mapping of the minification filter to HW values.

They're actually as documented in the HW specs and the GL mipmapping enums
order. Fixes fbo-generatemipmap-filtering, and some other tests where we
were off by a few bits due to unexpected linear filtering.

commit | commitdiff | tree

Eric Anholt [Wed, 1 Oct 2014 17:58:02 +0000 (10:58 -0700)]

vc4: Make the last static array in vc4_program.c dynamically sized.

commit | commitdiff | tree

Eric Anholt [Tue, 30 Sep 2014 23:10:09 +0000 (16:10 -0700)]

vc4: Fix some broken indentation.

commit | commitdiff | tree

Eric Anholt [Tue, 30 Sep 2014 23:08:23 +0000 (16:08 -0700)]

vc4: Add support for the FACE semantic.

Fixes glsl-fs-frontfacing.

commit | commitdiff | tree

Eric Anholt [Tue, 30 Sep 2014 21:19:25 +0000 (14:19 -0700)]

vc4: Add support for TGSI_OPCODE_CLAMP.

This will be used by the shared LIT lowering code.

commit | commitdiff | tree

Eric Anholt [Tue, 30 Sep 2014 23:26:51 +0000 (16:26 -0700)]

vc4: Fix compiler warning

commit | commitdiff | tree

Anuj Phogat [Wed, 1 Oct 2014 22:24:27 +0000 (15:24 -0700)]

meta: Fix make check failures in setup_glsl_msaa_blit_scaled_shader()

introduced by commit 68ee950.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reported-by: Mark Janes <mark.a.janes@intel.com>

commit | commitdiff | tree

Brian Paul [Wed, 1 Oct 2014 15:03:13 +0000 (09:03 -0600)]

mesa: fix _mesa_alloc_dispatch_table() declaration

Insert 'void' parameter to match declaration in api_exec.h. Trivial.

commit | commitdiff | tree

Roland Scheidegger [Wed, 1 Oct 2014 21:14:46 +0000 (23:14 +0200)]

meta: (trivial) remove accidental double semicolon

commit | commitdiff | tree

Anuj Phogat [Thu, 4 Sep 2014 20:49:04 +0000 (13:49 -0700)]

i965: Enable EXT_framebuffer_multisample_blit_scaled for gen8

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>

commit | commitdiff | tree

Anuj Phogat [Fri, 5 Sep 2014 19:19:22 +0000 (12:19 -0700)]

meta: Implement ext_framebuffer_multisample_blit_scaled extension

This extension enables a multisample buffer resolve and buffer
scaling using a single glBlitFramebuffer() call. Currently, we
have this extension implemented in BLORP, which is only used by
SNB and IVB. This patch implements the extension in the meta path,
which makes it available to Broadwell.

Implementation features:
- Supports scaled resolves of 2X, 4X and 8X multisample buffers.

- Avoids unnecessary shader compilations by storing the precompiled
   shaders for each supported sample count.

- Uses bilinear filtering for both GL_SCALED_RESOLVE_FASTEST_EXT and
   GL_SCALED_RESOLVE_NICEST_EXT filter options. This is an allowed
   behavior in the extension's spec.

- I tried doing bicubic filtering for the GL_SCALED_RESOLVE_NICEST_EXT
   filter. It made the edges in the image look a little smoother, but
   the image gets blurred, so there is no overall quality improvement.
   For now I have dropped the idea of doing different filtering for
   the nicest filter.

V2:
- Minor changes to simplify the fragment shader.
- Refactor the code to move i965 specific sample_map computation out
   of Meta. We now use ctx->Const.SampleMap{2,4,8}x variables initialized
   by the driver.
- Use a simple msaa resolve shader for scaled resolves with scaling
   factor = 1.0.

V3:
- Make changes to create a string out of ctx->Const.SampleMap{2,4,8}x
   variables and use it in fragment shader.

V4:
- Make changes to use uint8_t type ctx->Const.SampleMap{2,4,8}x
   variables.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
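
The bilinear filtering mentioned above can be illustrated with a small C
sketch. This is not the meta fragment shader (which is GLSL and reads
multisample textures); it only shows the weighting of a 2x2 neighborhood.
The names mix_linear and bilinear are hypothetical:

```c
#include <assert.h>

/* Linear interpolation between a and b, with t in [0, 1]. */
static float mix_linear(float a, float b, float t)
{
    return a + (b - a) * t;
}

/*
 * Hypothetical helper: blend four neighboring values.
 * tl/tr/bl/br are the top-left, top-right, bottom-left and bottom-right
 * values; fx and fy are the fractional offsets of the sample point
 * inside that 2x2 neighborhood.
 */
static float bilinear(float tl, float tr, float bl, float br,
                      float fx, float fy)
{
    assert(fx >= 0.0f && fx <= 1.0f);
    assert(fy >= 0.0f && fy <= 1.0f);

    float top = mix_linear(tl, tr, fx);     /* blend along x, upper row */
    float bottom = mix_linear(bl, br, fx);  /* blend along x, lower row */
    return mix_linear(top, bottom, fy);     /* blend the two rows along y */
}
```

A bicubic filter would instead weight a 4x4 neighborhood, which is consistent
with the smoother edges but blurrier result described in the commit message.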

commit | commitdiff | tree

Anuj Phogat [Tue, 23 Sep 2014 18:58:02 +0000 (11:58 -0700)]

i965: Initialize the SampleMap{2,4,8}x variables

with values specific to Intel hardware.

V2: Define and use gen6_get_sample_map() function to initialize
the variables.

V3: Change the function name to gen6_set_sample_maps() and use
memcpy() to fill in the data.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
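
A minimal C sketch of the memcpy()-based initialization described in V3. The
struct, the function name set_sample_maps and the table contents are all
assumptions for illustration: the identity layouts below are placeholders,
not the values used for Intel hardware.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the SampleMap{2,4,8}x context constants. */
struct sample_maps {
    uint8_t map2x[2];
    uint8_t map4x[4];
    uint8_t map8x[8];
};

/* Copy per-sample-count layout tables into the context-like struct. */
static void set_sample_maps(struct sample_maps *maps)
{
    /* Placeholder identity layouts; real values are hardware specific. */
    static const uint8_t map_2x[2] = {0, 1};
    static const uint8_t map_4x[4] = {0, 1, 2, 3};
    static const uint8_t map_8x[8] = {0, 1, 2, 3, 4, 5, 6, 7};

    assert(maps != NULL);
    memcpy(maps->map2x, map_2x, sizeof(map_2x));
    memcpy(maps->map4x, map_4x, sizeof(map_4x));
    memcpy(maps->map8x, map_8x, sizeof(map_8x));
}
```

Keeping the tables as uint8_t arrays makes each copy a single fixed-size
memcpy() whose length comes from sizeof on the source table.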

commit | commitdiff | tree

Anuj Phogat [Tue, 23 Sep 2014 18:56:54 +0000 (11:56 -0700)]

mesa: Add new variables in gl_context to store sample layout

SampleMap{2,4,8}x variables are used in later patches to implement
EXT_framebuffer_multisample_blit_scaled extension.

V2: Use integer array instead of a string.
Bump up the comment.

V3: Use uint8_t type array.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>

commit | commitdiff | tree

Leo Liu [Thu, 18 Sep 2014 16:21:58 +0000 (12:21 -0400)]

st/va: implement vlVa(Query|Create|Get|Put|Destroy)Image

This patch implements functions for image support, which
basically covers copying data between video surfaces and
user buffers. In this case it supports SW decode and other
video output.

v2: fix buffer size for odd-sized image case
expose I420 format as well
v3: fix YUV 4:2:2 format data buffer size
cleanup I420 format exposure

Signed-off-by: Leo Liu <leo.liu@amd.com>
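
To illustrate why odd-sized images need care when sizing an I420 buffer, here
is a minimal C sketch. It assumes a tightly packed layout (full-resolution Y
plane followed by two chroma planes subsampled by 2 in each direction, no row
padding). The helper name i420_buffer_size is hypothetical and this is not the
st/va code:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: bytes needed for a tightly packed I420 image. */
static size_t i420_buffer_size(unsigned width, unsigned height)
{
    /* Round up so the last column/row of an odd-sized image keeps chroma. */
    size_t chroma_w = (width + 1u) / 2u;
    size_t chroma_h = (height + 1u) / 2u;

    assert(width > 0 && height > 0);

    /* One Y plane plus the U and V planes. */
    return (size_t)width * height + 2u * chroma_w * chroma_h;
}
```

Truncating division (width / 2) would drop the chroma samples for the final
column or row whenever a dimension is odd, giving a buffer that is too small.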

commit | commitdiff | tree

Christian König [Thu, 18 Sep 2014 15:57:46 +0000 (11:57 -0400)]

st/va: implement Picture functions for mpeg2 h264 and vc1

This patch implements codec support for mpeg2, h264 and vc1:
it populates the codec parameters and passes them to the HW driver.

Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Leo Liu <leo.liu@amd.com>

commit | commitdiff | tree

Christian König [Fri, 4 Jul 2014 16:44:36 +0000 (12:44 -0400)]

st/va: implement Context Surface and Buffer

This patch implements context management and ties it to the HW driver,
along with functions for video surface management and functions for
application data memory buffer management.

implemented functions:
vlVa(Create|Destroy)Context
vlVa(Create|Destroy|Put)Surfaces
vlVa(Create|Destroy)Buffer

Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Leo Liu <leo.liu@amd.com>

commit | commitdiff | tree

Christian König [Tue, 28 May 2013 16:02:58 +0000 (18:02 +0200)]

st/va: implement vlVa(Create|Destroy|Query|Get)Config

This patch lets applications query configuration,
such as profiles, entrypoints, and attributes.

v2: fix missing profile with query

Signed-off-by: Michael Varga <michael.varga@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Leo Liu <leo.liu@amd.com>