mesa.git
Emil Velikov [Mon, 21 Nov 2016 13:46:50 +0000 (13:46 +0000)]
egl/x11: factor out dri2_get_xcb_connection()

Identical throughout dri2, dri3 and drisw. Next patch will add more
common code, so rather than duplicating it factor out the function.

Note: this also sets eglError on failure, something that is handled quite
inconsistently throughout the codebase.

v2: Call xcb_disconnect() on error (Eric)

Note: use xcb_disconnect() even in the xcb_connection_has_error() case
as per the manual:
... memory will not be freed until xcb_disconnect...

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)
Timothy Arceri [Tue, 22 Nov 2016 06:59:41 +0000 (17:59 +1100)]
mesa/glsl: remove unused uses_builtin_functions field

This has been unused since 943b69cddd

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Kenneth Graunke [Mon, 17 Oct 2016 21:23:10 +0000 (14:23 -0700)]
i965: Use NIR-based clip/cull lowering for OpenGL as well.

The old approach works fine, and this approach isn't necessarily better.
But it at least has the advantage that Vulkan and GL use the same
approach.  I originally wrote it to gain additional testing for the
new paths.

shader-db statistics show 0 instruction count changes.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Kenneth Graunke [Tue, 4 Oct 2016 03:44:38 +0000 (20:44 -0700)]
anv: Enable clip and cull distance support.

Everything is now in place, and we appear to pass the tests on Gen7+.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Kenneth Graunke [Mon, 17 Oct 2016 18:14:10 +0000 (11:14 -0700)]
i965/vec4: Handle component qualifiers on non-generic varyings.

ARB_enhanced_layouts only requires component qualifier support for
generic varyings, so this is all the vec4 backend knew how to handle.

This patch extends the backend to handle it for all varyings, so we
can use store_output intrinsics with a component set for things like
clip/cull distances.  We may want to use that for other VUE header
fields in the future as well.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Kenneth Graunke [Tue, 4 Oct 2016 08:59:33 +0000 (01:59 -0700)]
i965/fs: Handle compact outputs.

We need to calculate the number of vec4 slots correctly.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Kenneth Graunke [Tue, 4 Oct 2016 06:46:37 +0000 (23:46 -0700)]
spirv: Silence unsupported capability warnings for Clip/CullDistance.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Kenneth Graunke [Tue, 4 Oct 2016 06:44:07 +0000 (23:44 -0700)]
anv: Set clip/cull distances fields in packets.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Kenneth Graunke [Tue, 4 Oct 2016 03:42:42 +0000 (20:42 -0700)]
anv: Combine ClipDistance and CullDistance arrays.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Kenneth Graunke [Tue, 4 Oct 2016 05:37:40 +0000 (22:37 -0700)]
nir: add a pass to compact clip/cull distances.

v2: Use nir_is_per_vertex_io() rather than is_arrays_of_arrays().

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Kenneth Graunke [Tue, 4 Oct 2016 03:32:22 +0000 (20:32 -0700)]
nir: Add a "compact array" flag and IO lowering code.

Certain built-in arrays, such as gl_ClipDistance[], gl_CullDistance[],
gl_TessLevelInner[], and gl_TessLevelOuter[] are specified as scalar
arrays.  Normal scalar arrays are sparse - each array element usually
occupies a whole vec4 slot.  However, most hardware assumes these
built-in arrays are tightly packed.

The new var->data.compact flag indicates that a scalar array should
be tightly packed, so a float[4] array would take up a single vec4
slot, and a float[8] array would take up two slots.

They are still arrays, not vec4s, however.  nir_lower_io will generate
intrinsics using ARB_enhanced_layouts style component qualifiers.

v2: Add nir_validate code to enforce type restrictions.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
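
The slot arithmetic described for compact arrays can be sketched as below. This is only an illustration of the packing rule stated in the message (float[4] takes one vec4 slot, float[8] takes two); the function names are hypothetical and are not part of NIR.

```c
#include <stdbool.h>

/* Number of vec4 slots occupied by a scalar array of `length` elements.
 * `compact` stands in for the var->data.compact flag (hypothetical helper). */
static unsigned
scalar_array_slots(unsigned length, bool compact)
{
   if (compact)
      return (length + 3) / 4;   /* four scalars share one vec4 slot */
   return length;                /* sparse: one whole slot per element */
}

/* Slot and component that element `i` of a compact array lands in,
 * in the spirit of an ARB_enhanced_layouts component qualifier. */
static void
compact_element_location(unsigned i, unsigned *slot, unsigned *component)
{
   *slot = i / 4;
   *component = i % 4;
}
```

With this rule, element 5 of a compact float[8] array sits in slot 1, component 1.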
Dave Airlie [Tue, 22 Nov 2016 04:17:49 +0000 (04:17 +0000)]
radv: add support for shader stats dump

I've started working on a shader-db equivalent for Vulkan. It is
based on vktrace and records pipelines. This adds support to dump
the shader stats exactly like radeonsi does, so I can reuse the
shader-db scripts it uses.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Mon, 21 Nov 2016 01:12:39 +0000 (01:12 +0000)]
radv: fix sample id loading

The sample id is packed into bits 8-12, so adjust
things properly.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
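
Extracting a field that is packed into the middle of a register can be sketched as a shift followed by a mask. The helper below is hypothetical, and the field width used in the usage note (4 bits starting at bit 8) is an assumption; the message only says "bits 8-12".

```c
#include <stdint.h>

/* Return `bit_count` bits of `value`, starting at bit `first_bit`.
 * Hypothetical helper, not the driver's own function. */
static uint32_t
unpack_bits(uint32_t value, unsigned first_bit, unsigned bit_count)
{
   uint32_t mask = (bit_count >= 32) ? 0xFFFFFFFFu
                                     : ((1u << bit_count) - 1u);
   return (value >> first_bit) & mask;
}
```

Under that assumption the sample id would be read as unpack_bits(ancillary, 8, 4), where `ancillary` is a placeholder name for the packed input value.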
Dave Airlie [Wed, 16 Nov 2016 03:54:22 +0000 (03:54 +0000)]
radv/ac: add implementation of load_sample_pos intrinsic.

This fixes a bunch of crashes in CTS tests looking for this.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Sun, 20 Nov 2016 23:56:45 +0000 (23:56 +0000)]
radv/ac: cleanup ddxy emission

This cleans up the ddxy emission along the same lines as
radeonsi. It also means we don't use LDS on VI chips, where we
use the dspermute interface instead, and it removes some
duplicated code.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Mon, 21 Nov 2016 03:31:09 +0000 (03:31 +0000)]
radv/meta: cleanup resolve vertex state emission

For the hw resolve there is no need to emit any sort
of texture coordinates, so drop them all in the meta path.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Mon, 21 Nov 2016 23:39:50 +0000 (00:39 +0100)]
radv: Incorporate GPU family into cache UUID.

Invalidates the cache when someone switches cards.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Bas Nieuwenhuizen [Mon, 21 Nov 2016 23:19:30 +0000 (00:19 +0100)]
radv: Use library mtime for cache UUID.

We want to also invalidate the cache when LLVM gets changed. As the
specific LLVM revision is not fixed at build time, we will need to
check at runtime. Computing a checksum for LLVM is going to be very
expensive, so just use the mtime.

Tested on my computer that the returned DSO for the LLVM symbol is
actually the LLVM DSO.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
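
A minimal sketch of building a cache UUID from a library mtime and a GPU family, so that a change in either one invalidates the cache. The 16-byte size, the byte layout and the function name are assumptions made for illustration, not the layout radv uses. Looking up the library path and reading its mtime (for example with stat()) is left out.

```c
#include <stdint.h>
#include <string.h>

#define SKETCH_UUID_SIZE 16   /* assumed size, for illustration only */

/* Pack the two inputs into a zero-padded UUID buffer (hypothetical layout). */
static void
build_cache_uuid(uint8_t uuid[SKETCH_UUID_SIZE],
                 uint32_t library_mtime, uint32_t gpu_family)
{
   memset(uuid, 0, SKETCH_UUID_SIZE);
   memcpy(uuid, &library_mtime, sizeof(library_mtime));
   memcpy(uuid + 4, &gpu_family, sizeof(gpu_family));
}
```

Two UUIDs built this way compare equal with memcmp() only if both the mtime and the family match.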
Bas Nieuwenhuizen [Mon, 21 Nov 2016 23:31:44 +0000 (00:31 +0100)]
radv: Store UUID in physical device.

No sense in repeatedly determining it. Also, it might be dependent
on the device as shaders get compiled differently for SI/CIK/VI etc.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Timothy Arceri [Tue, 22 Nov 2016 02:19:33 +0000 (13:19 +1100)]
glsl: fix NULL check

Fixes copy and paste error in 9d96d3803ab

Ilia Mirkin [Sat, 19 Nov 2016 01:19:24 +0000 (20:19 -0500)]
swr: calculate viewport width/height based on the scale

The former calculations were for min/max y. The width/height don't take
translate into account.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
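
Deriving a viewport extent from the scale alone can be sketched as follows. This assumes the usual gallium convention that the viewport scale is half the extent and the translate is the centre, so the translate plays no part in width or height. The function name is hypothetical.

```c
/* Width or height of a viewport given the matching scale component.
 * The absolute value covers a flipped (negative) scale. */
static float
viewport_extent_from_scale(float scale)
{
   float half_extent = scale < 0.0f ? -scale : scale;
   return 2.0f * half_extent;
}
```

For a 640x480 viewport with a flipped y axis the scales would be 320 and -240, which map back to 640 and 480.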
Ilia Mirkin [Sun, 20 Nov 2016 18:13:12 +0000 (13:13 -0500)]
swr: don't claim to allow setting layer/viewport from VS

This may ultimately be possible to support, but for now it's not hooked
up and the swr core only supports this output from GS.

This normally wouldn't matter, but we lie about supporting GL 3.2, and
also the blitter and st/mesa will make use of this functionality if
claimed.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
Ilia Mirkin [Sun, 20 Nov 2016 21:34:59 +0000 (16:34 -0500)]
swr: allocate all scratch space in one go for vertex buffers

Multiple buffers may reference client arrays. When this happens, we
might reach for scratch space multiple times, which could cause later
arrays to invalidate the pointers allocated for the earlier ones.

This fixes copyteximage 2D_ARRAY.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Ilia Mirkin [Sun, 20 Nov 2016 05:04:42 +0000 (00:04 -0500)]
swr: call swr_update_derived unconditionally when drawing/clearing

Currently a sequence like draw/map/draw/map will cause the second map to
not wait for the second draw. This is because the first map will clear
the resource business bit, and the second draw won't reset it since no
state has changed.

swr_update_derived does a tiny bit of extra work, including updating the
SWR_BACKEND_STATE as well as waiting for pending fences. If that's a
problem, we could call swr_update_resource_status directly from
draw/clear handlers.

Fixes clearbuffer-stencil, clearbuffer-depth, clearbuffer-depth-stencil,
and clearbuffer-display-lists.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Ilia Mirkin [Fri, 18 Nov 2016 03:40:29 +0000 (22:40 -0500)]
swr: [rasterizer memory] minify texture width before alignment

The minification should happen before alignment, not after. See similar
logic on ComputeLODOffsetY. The current logic requires unnecessarily
large textures when there's an initial NPOT size.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
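
The effect of the ordering can be sketched with two small helpers. The names and the alignment value are assumptions chosen for illustration; they are not the SWR functions themselves.

```c
#include <stdint.h>

/* Mip size at `level`, never smaller than 1. */
static uint32_t
minify(uint32_t size, uint32_t level)
{
   uint32_t m = size >> level;
   return m ? m : 1;
}

/* Round `value` up to a multiple of `alignment`. */
static uint32_t
align_up(uint32_t value, uint32_t alignment)
{
   return (value + alignment - 1) / alignment * alignment;
}

/* Ordering the message asks for: minify first, then align. */
static uint32_t
mip_width_minify_first(uint32_t width, uint32_t level, uint32_t alignment)
{
   return align_up(minify(width, level), alignment);
}

/* Ordering the message argues against: align the base width first. */
static uint32_t
mip_width_align_first(uint32_t width, uint32_t level, uint32_t alignment)
{
   return align_up(minify(align_up(width, alignment), level), alignment);
}
```

For an NPOT width of 257, level 1 and an alignment of 128, minifying first gives 128 while aligning first gives 256. For a power-of-two width such as 256 both orderings agree.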
Ilia Mirkin [Sat, 12 Nov 2016 08:01:15 +0000 (03:01 -0500)]
swr: [rasterizer memory] minify original sizes for block formats

There's no guarantee that mip width/height will be a multiple of the
compressed block size. Doing a divide by the block size first yields
different results than GL expects, so we do the divide at the end.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
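
Why the divide has to come last can be sketched with a worked example. The helper names and the 4-texel block size are assumptions for illustration only.

```c
#include <stdint.h>

/* Mip size in texels at `level`, never smaller than 1. */
static uint32_t
minify_texels(uint32_t texels, uint32_t level)
{
   uint32_t m = texels >> level;
   return m ? m : 1;
}

/* Minify the original texel size, then convert to blocks (rounding up). */
static uint32_t
mip_blocks_divide_last(uint32_t texels, uint32_t level, uint32_t block)
{
   return (minify_texels(texels, level) + block - 1) / block;
}

/* Convert to blocks first, then minify the block count. */
static uint32_t
mip_blocks_divide_first(uint32_t texels, uint32_t level, uint32_t block)
{
   return minify_texels((texels + block - 1) / block, level);
}
```

A 10-texel-wide texture with 4-texel blocks is 5 texels wide at level 1, which needs 2 blocks. Dividing first yields 3 blocks at level 0 and only 1 block at level 1, which is too small.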
Marek Olšák [Tue, 15 Nov 2016 20:15:55 +0000 (21:15 +0100)]
radeonsi: remove all varyings for depth-only rendering or rasterization off

Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Mon, 14 Nov 2016 08:09:51 +0000 (09:09 +0100)]
radeonsi: eliminate VS outputs that aren't used by PS at runtime

A past commit added the ability to compile "optimized" shader variants
asynchronously (not stalling the app).

This commit builds upon that and adds what is basically a runtime shader
linker. If a VS output isn't used by the currently-bound PS, a new VS
compilation is started without that output. The new shader variant
is used when it's ready.

All apps using separate shader objects I've seen had unused VS outputs.

Eliminating unused/useless VS outputs also eliminates the corresponding
vertex attribute loads.

Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
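
The core of such a runtime link step can be sketched as a bitmask operation. Representing written outputs and read inputs as 64-bit masks, one bit per varying, is an assumption made for this sketch, and the function name is hypothetical.

```c
#include <stdint.h>

/* Outputs the VS writes that the currently bound PS never reads.
 * These are the candidates to drop from an optimized VS variant. */
static uint64_t
unused_vs_outputs(uint64_t vs_outputs_written, uint64_t ps_inputs_read)
{
   return vs_outputs_written & ~ps_inputs_read;
}
```

A non-zero result would be the trigger for compiling a new variant without those outputs. The original variant stays in use until the new one is ready.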
Marek Olšák [Mon, 14 Nov 2016 06:56:57 +0000 (07:56 +0100)]
radeonsi: record information about all written and read varyings

It's just tgsi_shader_info with DEFAULT_VAL varyings removed.

Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 13 Nov 2016 18:54:13 +0000 (19:54 +0100)]
radeonsi: make si_shader_io_get_unique_index stricter

Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 13 Nov 2016 17:41:43 +0000 (18:41 +0100)]
radeonsi: don't export ClipVertex and ClipDistance[] if clipping is disabled

This is the first user of optimized monolithic shader variants.

Cull distances can't be disabled by states.

Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Thu, 13 Oct 2016 10:18:53 +0000 (12:18 +0200)]
radeonsi: add infrastr. for compiling optimized shader variants asynchronously

Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 13 Nov 2016 16:30:54 +0000 (17:30 +0100)]
radeonsi: don't set vs.epilog.export_prim_id if TES is bound

There is no VS epilog in this case.

Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 13 Nov 2016 18:21:46 +0000 (19:21 +0100)]
radeonsi: simplify checking for monolithic compilation

Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Mon, 14 Nov 2016 00:53:24 +0000 (01:53 +0100)]
radeonsi: print all flags in si_dump_shader_key

Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 13 Nov 2016 02:17:46 +0000 (03:17 +0100)]
radeonsi: split the shader key into 3 logical parts

key->part.*: prolog and epilog flags only
key->as_{ls,es}: special flags
key->mono.*: flags for monolithic compilation only

Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 13 Nov 2016 17:12:36 +0000 (18:12 +0100)]
radeonsi: fix culling if clip & cull distances are used at the same time

Fixed piglits:
- arb_cull_distance/clip-cull-3
- arb_cull_distance/clip-cull-4

Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 13 Nov 2016 16:51:41 +0000 (17:51 +0100)]
radeonsi: clean up si_emit_clip_regs

Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Mon, 14 Nov 2016 01:03:28 +0000 (02:03 +0100)]
radeonsi: assume that a VS without POSITION is LS

Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Mon, 14 Nov 2016 01:01:34 +0000 (02:01 +0100)]
tgsi/scan: record if a shader writes the position output

Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Mon, 14 Nov 2016 00:59:42 +0000 (01:59 +0100)]
tgsi/scan: use a big switch for scanning outputs

Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 6 Nov 2016 20:49:29 +0000 (21:49 +0100)]
radeonsi: decrease the number of texture slots to 24

Company Of Heroes 2 needs only 24.

This saves 512 bytes of CE RAM per shader stage.

Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Fri, 11 Nov 2016 21:36:17 +0000 (22:36 +0100)]
radeonsi: fast exit si_emit_derived_tess_state early

Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Fri, 11 Nov 2016 20:19:34 +0000 (21:19 +0100)]
winsys/amdgpu: set addrlib flag opt4Space

Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Fri, 11 Nov 2016 20:15:54 +0000 (21:15 +0100)]
radeonsi: check for !is_linear in do_hardware_msaa_resolve

We don't want opt4Space here.

Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Fri, 11 Nov 2016 20:14:03 +0000 (21:14 +0100)]
gallium/radeon: add RADEON_SURF_OPTIMIZE_FOR_SPACE

FORCE_TILING should disable it. It has no effect now, but that may change
soon.

Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Mun Gwan-gyeong [Mon, 21 Nov 2016 14:20:43 +0000 (23:20 +0900)]
radeonsi: Add missing error-checking to si_create_compute_state (v2)

When uploading the shader fails, si_shader_binary_upload() returns
-ENOMEM. Handle that failure path in si_create_compute_state().

CID 1394027

v2: Fixes from Edward O'Callaghan's review
 a) Check the return value explicitly with "si_shader_binary_upload() < 0"
 b) Update commit message.

Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
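
The error-handling pattern can be sketched with stand-in functions. Everything below is hypothetical: sketch_upload() merely imitates an upload that returns a negative errno value on failure, and freeing the state before returning NULL is an assumption about what a reasonable failure path looks like, not a copy of the radeonsi code.

```c
#include <errno.h>
#include <stdlib.h>

struct sketch_program {
   int uploaded;
};

/* Stand-in for an upload step; `fail` forces the out-of-memory path. */
static int
sketch_upload(struct sketch_program *program, int fail)
{
   if (fail)
      return -ENOMEM;
   program->uploaded = 1;
   return 0;
}

/* Create a state object, cleaning up if the upload fails. */
static struct sketch_program *
sketch_create_state(int fail)
{
   struct sketch_program *program = calloc(1, sizeof(*program));
   if (!program)
      return NULL;

   if (sketch_upload(program, fail) < 0) {   /* explicit "< 0" check */
      free(program);
      return NULL;
   }
   return program;
}
```

Checking "< 0" rather than "!= 0" makes it explicit that only negative errno-style values count as failure.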
Roland Scheidegger [Sun, 13 Nov 2016 15:33:37 +0000 (16:33 +0100)]
draw: drop some overflow computations

It turns out that no one actually cares if the address computations overflow,
be it the stride mul or the offset adds.
Wrap around seems to be explicitly permitted even by some other API (which
is a _very_ surprising result, as these overflow computations were added just
for that and made some tests pass at that time - I suspect some later fixes
fixed the actual root cause...). So the requirements in that other API were
actually sane there all along after all...
Still need to make sure the computed buffer size needed is valid, of course.
This ditches the shiny new widening mul from these codepaths, ah well...

And now that I really understand this, change the fishy min limiting
indices to what it really should have done. Which is simply to prevent
fetching more values than valid for the last loop iteration. (This makes
the code path in the loop minimally more complex for the non-indexed case
as we have to skip the optimization combining two adds. I think it should
be safe to skip this actually there, but I don't care much about this
especially since skipping that optimization actually makes the code easier
to read elsewhere.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Roland Scheidegger [Sun, 13 Nov 2016 15:33:20 +0000 (16:33 +0100)]
draw: simplify fetch some more

Don't keep the ofbit. This is just a minor simplification, just adjust
the buffer size so that there will always be an overflow if buffers aren't
valid to fetch from.
Also, get rid of control flow from the instanced path too. Not worried about
performance, but it's simpler and keeps the code more similar to ordinary
fetch.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Roland Scheidegger [Sun, 13 Nov 2016 15:32:24 +0000 (16:32 +0100)]
draw: unify linear and elts draw jit functions

The code for elts and linear paths was nearly 100% identical by now - with
the elts path simply having some additional gather for the elements in the
main loop (with some additional small differences before the main loop).

Hence nuke the separate functions and decide this at jit shader execution
time (simply based on the presence of the elts pointer).

Some analysis shows that the generated vs jit functions seem to be just very
minimally more complex than the former elts functions, and almost none of the
additional complexity is in the main loop (basically just the branch logic
for the branch fetching the actual indices).
Compared to linear, the codesize of the function is of course a bit larger,
however the actual executed code in the main loop appears to be near 100%
identical (the additional code looking up indices is skipped as expected).

So, I would not expect a (meaningful) performance difference with the
generated code, neither with elts nor linear. This does, however, roughly
halve the compilation time (the compiled shaders should also use only half
the memory, of course).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
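
Choosing between the linear and elts paths at execution time, based only on whether an elts pointer is present, can be sketched in plain C as below. The real decision is made inside JIT-generated shader code; this function and its name are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Vertex index for loop iteration `i`.
 * With an index buffer (`elts` != NULL) the index is gathered from it;
 * otherwise the draw is linear and the index is simply start + i. */
static uint32_t
vertex_index(const uint32_t *elts, uint32_t start, uint32_t i)
{
   return elts ? elts[i] : start + i;
}
```

In the linear case the gather is skipped entirely, which matches the observation that the executed main-loop code stays nearly the same.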
Roland Scheidegger [Sun, 13 Nov 2016 15:31:57 +0000 (16:31 +0100)]
draw: use same argument order for jit draw linear / elts functions

This is a bit simpler. Mostly to make it easier to unify the paths later...

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Roland Scheidegger [Sun, 13 Nov 2016 16:17:25 +0000 (17:17 +0100)]
draw: drop unnecessary index overflow handling from vsplit code

This was kind of strange, since it replaced indices which were only
overflowing due to bias with MAX_UINT. This would cause an overflow later
in the shader, except if stride was 0, however the vertex id would be
essentially random then (-1 + eltBias). No test cared about it, though.
So, drop this and just use ordinary int arithmetic wraparound as usual.
This is much simpler to understand and the results are "more correct" or
at least more consistent (vertex id as well as actual fetch results just
correspond to wrapped around arithmetic).
There's only one catch, it is now possible to hit the cache initialization
value also with ushort and ubyte elts path (this wouldn't be an issue if
we'd simply handle the eltBias itself later in the shader). Hence, we need
to make sure the cache logic doesn't think this element has already been
emitted when it has not (I believe some seriously bad things could happen
otherwise). So, borrow the logic which handled this from the uint case, but
not before fixing it up...

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
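
The "ordinary int arithmetic wraparound" can be sketched as a plain unsigned add. The function name is hypothetical; the point is only that the biased index wraps modulo 2^32 instead of being replaced by a sentinel value.

```c
#include <stdint.h>

/* Apply the element bias with 32-bit wraparound.
 * Converting a negative bias to uint32_t is well defined in C. */
static uint32_t
biased_index(uint32_t idx, int32_t elt_bias)
{
   return idx + (uint32_t)elt_bias;
}
```

One consequence, as the message notes, is that any 32-bit value can now come out of this add, including whatever value a cache uses to mark an empty entry.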
Roland Scheidegger [Sat, 12 Nov 2016 21:47:22 +0000 (22:47 +0100)]
draw: simplify vsplit elts code a bit

vsplit_get_base_idx explicitly returned idx 0 and set the ofbit
in case of overflow. We'd then check the ofbit and use idx 0 instead of
looking it up. This was necessary because DRAW_GET_IDX used to return
DRAW_MAX_FETCH_IDX and not 0 in case of overflows.
However, this is all unnecessary, we can just let DRAW_GET_IDX return 0
in case of overflow. In fact before bbd1e60198548a12be3405fc32dd39a87e8968ab
the code already did that, not sure why this particular bit was changed
(might have been one half of an attempt to get these indices to actual draw
shader execution - in fact I think this would make things less awkward, it
would require moving the eltBias handling to the shader as well).
Note there's other callers of DRAW_GET_IDX - those code paths however
explicitly do not handle index buffer overflows, therefore the overflow
value doesn't matter for them.

Also do some trivial simplification - for (unsigned) a + b, checking res < a
is sufficient for overflow detection, we don't need to check for res < b too
(similar for signed).

And an index buffer overflow check looked bogus - eltMax is the number of
elements in the index buffer, not the maximum element which can be fetched.
(Drop the start check against the idx buffer though, this is already covered
by end check and end < start).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
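
The two checks discussed above can be sketched as follows. For unsigned a + b, comparing the result against a alone is enough to detect overflow, and a range fits in the index buffer when its end does not exceed the element count. Function and parameter names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Unsigned add with overflow detection: the sum wrapped iff res < a. */
static bool
add_overflows_u32(uint32_t a, uint32_t b, uint32_t *res)
{
   *res = a + b;
   return *res < a;
}

/* `elt_max` is the number of elements in the index buffer,
 * so a range is valid when start + count <= elt_max. */
static bool
range_in_index_buffer(uint32_t start, uint32_t count, uint32_t elt_max)
{
   uint32_t end;

   if (add_overflows_u32(start, count, &end))
      return false;
   return end <= elt_max;
}
```

No separate check of start against the buffer is needed here: if the end fits and the add did not overflow, the start fits too.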
George Kyriazis [Fri, 11 Nov 2016 18:09:36 +0000 (12:09 -0600)]
gallium: Add support for SWR compilation

Include swr library and include -DHAVE_SWR in the compile line.

v3: split to a separate commit

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
George Kyriazis [Fri, 18 Nov 2016 17:40:09 +0000 (11:40 -0600)]
gallium: swr: Added swr build for windows

v4: Add windows-specific gen_knobs.{cpp|h} changes
v5: remove aggressive squashing of gen_knobs.py to this commit; added
SConscript to EXTRA_DIST in Makefile.am

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
George Kyriazis [Fri, 18 Nov 2016 17:38:30 +0000 (11:38 -0600)]
swr: Modify gen_knobs.{cpp|h} creation script

Modify gen_knobs.py so that each invocation creates a single generated
file.  This is more similar to how the other generators behave.

v5: remove SConscript edits from this commit; moved to commit that first
adds SConscript

Acked-by: Emil Velikov <emil.velikov@collabora.com>
George Kyriazis [Wed, 16 Nov 2016 00:52:39 +0000 (18:52 -0600)]
scons: Add swr compile option

To build the SWR driver (currently optional, not compiled by default).

v3: add option as opposed to target

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
George Kyriazis [Fri, 11 Nov 2016 17:57:49 +0000 (11:57 -0600)]
swr: Windows-related changes

- Handle dynamic library loading for windows
- Implement swap for gdi
- fix prototypes
- update include paths on configure-based build for swr_loader.cpp

v2: split to multiple patches
v3: split and reshuffle some more; renamed title
v4: move Makefile.am changes to other commit. Modify header files

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
George Kyriazis [Thu, 17 Nov 2016 22:21:12 +0000 (16:21 -0600)]
swr: renamed duplicate swr_create_screen()

There are 2 swr_create_screen() functions.  One in swr_loader.cpp, which
is used during driver init, and the other is hiding in swr_screen.cpp,
which ends up in the arch-specific .dll/.so.

Rename the second one to swr_create_screen_internal(), to avoid confusion
in header files.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
George Kyriazis [Fri, 11 Nov 2016 17:44:05 +0000 (11:44 -0600)]
swr: Handle windows.h and NOMINMAX

Reorder header files so that we have a chance to define NOMINMAX before
mesa include files include windows.h

v3: split from bigger patch

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
George Kyriazis [Wed, 9 Nov 2016 22:40:58 +0000 (16:40 -0600)]
gallium: Added SWR support for gdi

Added hooks for screen creation and swap.  Still keep llvmpipe the default
software renderer.

v2: split from bigger patch
v3: reword commit message

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
George Kyriazis [Wed, 9 Nov 2016 22:35:48 +0000 (16:35 -0600)]
scons: add llvm 3.9 support.

v2: reworded commit message

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
George Kyriazis [Wed, 9 Nov 2016 22:33:10 +0000 (16:33 -0600)]
scons: ignore .hpp files in parse_source_list()

Drivers that contain C++ .hpp files need to ignore them too, along
with .h files, when building source file lists.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
George Kyriazis [Wed, 9 Nov 2016 22:31:21 +0000 (16:31 -0600)]
mesa: removed redundant #else

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Jordan Justen [Sat, 19 Nov 2016 22:52:29 +0000 (14:52 -0800)]
i965/hsw: Set integer mode in sampling state for stencil texturing

Fixes:

ES31-CTS.functional.texture.border_clamp.formats.depth24_stencil8_sample_stencil.nearest_size_pot
ES31-CTS.functional.texture.border_clamp.formats.depth24_stencil8_sample_stencil.nearest_size_npot
ES31-CTS.functional.texture.border_clamp.formats.depth32f_stencil8_sample_stencil.nearest_size_pot
ES31-CTS.functional.texture.border_clamp.formats.depth32f_stencil8_sample_stencil.nearest_size_npot
ES31-CTS.functional.texture.border_clamp.unused_channels.depth24_stencil8_sample_stencil
ES31-CTS.functional.texture.border_clamp.unused_channels.depth32f_stencil8_sample_stencil

Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Emil Velikov [Mon, 21 Nov 2016 15:59:50 +0000 (15:59 +0000)]
reviewers: add Rob H for the Android EGL+build parts

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Emil Velikov [Wed, 29 Jun 2016 12:26:07 +0000 (13:26 +0100)]
docs: recommend using --enable-mangling over the manual -DUSE...

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Emil Velikov [Wed, 29 Jun 2016 12:13:25 +0000 (13:13 +0100)]
docs: rework/update install.html

Still far from perfect, but a few small steps in the right direction.

 - Split build systems, compilers, third party tools
 - Mention building mesa for Android (part of AOSP)
 - Drop explicit "other" dependencies. Refer to distro methods to
get them.
 - HTML 4.01 Traditional compliance fixes - mixed ul and br tags.
 - nuke dead links README.{CYGWIN,VMS}

v2: Squash typos, add note about buggy flex 2.6.2 (Eric), add SUSE
zypper command (Tobias).

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Emil Velikov [Wed, 29 Jun 2016 11:52:57 +0000 (12:52 +0100)]
docs: sourcetree.html misc updates

A mixed bag of updates/fixes - mostly aiming at removing no longer
applicable directories.

Add a few more state-trackers, drivers, etc. alongside "XXX more" where
applicable. Account for the GLSL/NIR movement and nukage of
src/egl/docs.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
7 years agodocs: flesh out releasing.html
Emil Velikov [Wed, 16 Nov 2016 18:25:41 +0000 (18:25 +0000)]
docs: flesh out releasing.html

Properly document the whole process:
 - Brief on what, when, where
 - Picking, testing, branchpoints, pre-release announcement
 - Releasing, announcement, website and bugzilla updates

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
7 years agodocs/submittingpatches: fix tags mis/abuse
Emil Velikov [Wed, 16 Nov 2016 11:54:54 +0000 (11:54 +0000)]
docs/submittingpatches: fix tags mis/abuse

Fix the odd tag so that we're HTML 4.01 Transitional compliant

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
7 years agodocs/submittingpatches: flesh out "how to nominate" methods
Emil Velikov [Wed, 16 Nov 2016 11:51:50 +0000 (11:51 +0000)]
docs/submittingpatches: flesh out "how to nominate" methods

Currently they are buried within the text, making them hard to find.
Move them to the top and be clear what is _not_ a good idea.

v2: Minor commit polish, use only "resending" as suggested by Matt.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
7 years agodocs/autoconf: update glx driver / enable-debug text
Emil Velikov [Wed, 29 Jun 2016 13:26:36 +0000 (14:26 +0100)]
docs/autoconf: update glx driver / enable-debug text

With an earlier commit we folded all the xlib handling into --enable-glx,
but we forgot to update the documentation.

Elaborate on --enable-debug and drop mentions of dependencies.

v2: Grammar - s|haven't|hasn't| (Eric)

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
7 years agodocs/repository: refer to Submitting patches
Emil Velikov [Wed, 29 Jun 2016 12:40:03 +0000 (13:40 +0100)]
docs/repository: refer to Submitting patches

v2: Improve grammar - add missing "to" (Eric).

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
7 years agodocs: split Submitting Patches into separate document
Emil Velikov [Wed, 16 Nov 2016 00:20:56 +0000 (00:20 +0000)]
docs: split Submitting Patches into separate document

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
7 years agodocs: split Coding style into separate document
Emil Velikov [Wed, 16 Nov 2016 00:31:26 +0000 (00:31 +0000)]
docs: split Coding style into separate document

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
7 years agodocs: mention/suggest testing your patch against dEQP
Emil Velikov [Wed, 29 Jun 2016 10:55:47 +0000 (11:55 +0100)]
docs: mention/suggest testing your patch against dEQP

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
7 years agodocs: mention that coding style can differ between drivers
Emil Velikov [Wed, 29 Jun 2016 10:49:25 +0000 (11:49 +0100)]
docs: mention that coding style can differ between drivers

... and point people to use/honour the EditorConfig/Emacs files, where
applicable.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
7 years agoreviewers: add Tomasz for the Android/EGL implementation
Emil Velikov [Mon, 21 Nov 2016 14:33:44 +0000 (14:33 +0000)]
reviewers: add Tomasz for the Android/EGL implementation

As mentioned/requested on the mailing list.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years agomesa: fold always true conditional
Emil Velikov [Fri, 11 Nov 2016 16:43:28 +0000 (16:43 +0000)]
mesa: fold always true conditional

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
7 years agomesa: drop unneeded assert
Emil Velikov [Fri, 11 Nov 2016 16:43:27 +0000 (16:43 +0000)]
mesa: drop unneeded assert

As seen a couple of lines above - there's no way for the assert to
trigger.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
7 years agoegl/wayland: remove non-applicable destroyDrawable from error path
Emil Velikov [Fri, 11 Nov 2016 16:45:00 +0000 (16:45 +0000)]
egl/wayland: remove non-applicable destroyDrawable from error path

If we fail to create the drawable there's not much point in attempting
to destroy it.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
7 years agoloader: automake: whitespace cleanup
Emil Velikov [Thu, 17 Nov 2016 15:45:47 +0000 (15:45 +0000)]
loader: automake: whitespace cleanup

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
7 years agogbm: automake: remove unused defines
Emil Velikov [Thu, 17 Nov 2016 15:45:46 +0000 (15:45 +0000)]
gbm: automake: remove unused defines

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
7 years agointel: aubinator: Fix resource leak in gen_spec_load_from_path
Gwan-gyeong Mun [Sun, 20 Nov 2016 07:07:19 +0000 (16:07 +0900)]
intel: aubinator: Fix resource leak in gen_spec_load_from_path

This fixes a resource leak in the XML_ParserCreate failure path of
gen_spec_load_from_path.
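
A minimal sketch of the general pattern, not the actual aubinator code.
The message does not say which resource leaked, so this sketch assumes a
heap buffer that is acquired before the parser is created. All names here
(struct loader, loader_init, the two parser constructors) are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a parser constructor that may return NULL. */
typedef void *(*parser_create_fn)(void);

/* Hypothetical constructors used to exercise both paths. */
static void *failing_parser_create(void) { return NULL; }
static void *dummy_parser_create(void)
{
   static int dummy_parser;
   return &dummy_parser;
}

/* Hypothetical loader state: a buffer acquired before the parser. */
struct loader {
   char *buf;
   void *parser;
};

/* Returns 0 on success, -1 on failure. On failure nothing stays allocated. */
static int
loader_init(struct loader *ld, size_t buf_size, parser_create_fn create)
{
   ld->parser = NULL;
   ld->buf = malloc(buf_size);
   if (!ld->buf)
      return -1;

   ld->parser = create();
   if (!ld->parser) {
      /* The release a leaky failure path would forget. */
      free(ld->buf);
      ld->buf = NULL;
      return -1;
   }

   return 0;
}
```

Calling loader_init() with failing_parser_create leaves ld.buf set to NULL,
so the caller has nothing left to clean up on that path.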

CID 1373564

Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
7 years agoegl/android: Use gralloc::lock_ycbcr for resolving YUV formats (v2)
Tomasz Figa [Thu, 10 Nov 2016 07:55:53 +0000 (16:55 +0900)]
egl/android: Use gralloc::lock_ycbcr for resolving YUV formats (v2)

There is an interface that can be used to query YUV buffers for their
internal format. Specifically, if gralloc::lock_ycbcr() is given no SW
usage flags, it's supposed to return plane offsets instead of pointers.
Let's use this interface to implement support for YUV formats in Android
EGL backend.
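
A minimal sketch of the idea, not the patch itself. The struct below
mirrors the fields of Android's struct android_ycbcr but is declared
locally so the example builds without the Android headers.
struct yuv_plane_offsets, fill_plane_offsets() and is_semi_planar() are
hypothetical names. The sketch assumes the gralloc implementation stored
byte offsets in the pointer fields, as described above.

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for struct android_ycbcr (assumption: same field meaning). */
struct ycbcr_layout {
   void *y;
   void *cb;
   void *cr;
   size_t ystride;
   size_t cstride;
   size_t chroma_step;
};

/* Hypothetical result type: per-plane byte offsets and pitches. */
struct yuv_plane_offsets {
   size_t y_offset;
   size_t cb_offset;
   size_t cr_offset;
   size_t y_pitch;
   size_t c_pitch;
};

/* The "pointers" hold plane offsets here, so convert them to integers
 * instead of dereferencing them. */
static struct yuv_plane_offsets
fill_plane_offsets(const struct ycbcr_layout *l)
{
   struct yuv_plane_offsets out;

   out.y_offset = (size_t)(uintptr_t)l->y;
   out.cb_offset = (size_t)(uintptr_t)l->cb;
   out.cr_offset = (size_t)(uintptr_t)l->cr;
   out.y_pitch = l->ystride;
   out.c_pitch = l->cstride;
   return out;
}

/* A chroma step of 2 bytes means Cb and Cr are interleaved in one plane. */
static int
is_semi_planar(const struct ycbcr_layout *l)
{
   return l->chroma_step == 2;
}
```

The relative order of the Cb and Cr offsets, together with the chroma
step, is the kind of information a caller could use to pick a fourcc.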

v2: Fixes from Emil's review:
 a) Added comments for parts that might not be clear,
 b) Changed get_fourcc_yuv() to return -1 on failure,
 c) Changed is_yuv() to use bool.

Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
7 years agoegl/android: Get gralloc module in dri2_initialize_android() (v2)
Tomasz Figa [Thu, 10 Nov 2016 07:55:52 +0000 (16:55 +0900)]
egl/android: Get gralloc module in dri2_initialize_android() (v2)

Currently droid_open_device() gets a reference to the gralloc module
only for its own use and does not store it anywhere. To make it possible
to call gralloc methods from code added in further patches, let's
refactor current code to get gralloc module in dri2_initialize_android()
and store it in dri2_dpy.

v2: fixes from Emil's review:
 a) remove duplicate initialization of 'err'.

Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
7 years agoegl/android: Remove handling of RGB_888 pixel format
Tomasz Figa [Wed, 9 Nov 2016 08:32:54 +0000 (17:32 +0900)]
egl/android: Remove handling of RGB_888 pixel format

It is currently completely broken, as it ends up using RGBX_8888 on
hardware side, due to no way of distinguishing between these two in the
DRI API, while HAL_PIXEL_FORMAT_RGB_888 is clearly defined to be the
3-byte per pixel RGB format.
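
To illustrate why the two layouts cannot stand in for each other, here is
a small sketch of the byte arithmetic. It assumes tightly packed rows
(real buffers may pad the stride), and the enum and helper names are
hypothetical.

```c
#include <stddef.h>

/* Bytes per pixel for the two formats discussed above. */
enum {
   BPP_RGB_888 = 3,
   BPP_RGBX_8888 = 4
};

/* Minimum bytes needed for one tightly packed row. */
static size_t
row_bytes(size_t width, size_t bytes_per_pixel)
{
   return width * bytes_per_pixel;
}

/* Byte offset of pixel x within a tightly packed row. */
static size_t
pixel_byte_offset(size_t x, size_t bytes_per_pixel)
{
   return x * bytes_per_pixel;
}
```

For a 640 pixel wide row this gives 1920 bytes for RGB_888 and 2560 bytes
for RGBX_8888, and already the second pixel starts at byte 3 in one layout
and byte 4 in the other.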

Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
7 years agoradeonsi: Fix resource leak in gs_copy_shader allocation failure path
Gwan-gyeong Mun [Sun, 20 Nov 2016 04:19:57 +0000 (13:19 +0900)]
radeonsi: Fix resource leak in gs_copy_shader allocation failure path

CID 1394028

Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years agoglsl/lower_output_reads: remove unused mem_ctx
Nicolai Hähnle [Thu, 17 Nov 2016 20:53:35 +0000 (21:53 +0100)]
glsl/lower_output_reads: remove unused mem_ctx

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoglsl/lower_output_reads: bail early in tessellation control shaders
Nicolai Hähnle [Thu, 17 Nov 2016 20:52:32 +0000 (21:52 +0100)]
glsl/lower_output_reads: bail early in tessellation control shaders

This whole pass is a no-op.

Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoglsl/lower_output_reads: fix geometry shader output handling with conditional emit
Nicolai Hähnle [Thu, 17 Nov 2016 20:55:38 +0000 (21:55 +0100)]
glsl/lower_output_reads: fix geometry shader output handling with conditional emit

Consider a geometry shader that contains code like this:

   some_out = expr;

   if (cond) {
      ...
      EmitVertex();
   } else {
      ...
      EmitVertex();
   }

Both branches should see the correct value of some_out.

Since this is a rather subtle and rare case, I'm submitting a piglit test
for this as well.

GLSL says that the values of output variables are undefined after
EmitVertex(). With this change, the values will now be defined and
unmodified. This may reduce optimization opportunities in the probably
quite rare case where subsequent compiler passes cannot prove that the
value of the output variable is overwritten.

Cc: 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoradeonsi: store group_size_variable in struct si_compute
Nicolai Hähnle [Fri, 18 Nov 2016 14:18:10 +0000 (15:18 +0100)]
radeonsi: store group_size_variable in struct si_compute

For compute shaders, we free the selector after the shader has been
compiled, so we need to save this bit somewhere else.  Also, make sure that
this type of bug cannot re-appear, by NULL-ing the selector pointer after
we're done with it.
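
A minimal sketch of the pattern described above, using heavily simplified
and hypothetical stand-ins for the driver structs (struct selector,
struct compute_state and finish_compile() are not the real radeonsi
names).

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the shader selector. */
struct selector {
   bool uses_variable_group_size;
};

/* Hypothetical stand-in for the compute state object. */
struct compute_state {
   struct selector *sel;       /* only valid until compilation finishes */
   bool variable_group_size;   /* copy that outlives the selector */
};

/* Save the bit we still need, then free the selector and clear the
 * pointer so that a later stale access is a NULL dereference rather
 * than a silent use-after-free. */
static void
finish_compile(struct compute_state *cs)
{
   cs->variable_group_size = cs->sel->uses_variable_group_size;
   free(cs->sel);
   cs->sel = NULL;
}
```

After finish_compile(), code that needs the flag reads
cs->variable_group_size and never touches cs->sel.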

This bug has been there since the feature was added, but was only exposed
in piglit arb_compute_variable_group_size-local-size by commit
9bfee7047b70cb0aa026ca9536465762f96cb2b1 (which is totally unrelated).

Cc: 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoglsl: don't flatten if-blocks with dynamic array indices
Nicolai Hähnle [Thu, 17 Nov 2016 21:24:07 +0000 (22:24 +0100)]
glsl: don't flatten if-blocks with dynamic array indices

This fixes the regression of radeonsi in
glsl-1.10/execution/variable-indexing/vs-output-array-vec3-index-wr
caused by commit 74e39de9324d2d2333cda6adca50ae2a3fc36de2.

Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoanv/state: enable coordinate address rounding for Min/Mag filters
Iago Toral Quiroga [Fri, 18 Nov 2016 12:44:27 +0000 (13:44 +0100)]
anv/state: enable coordinate address rounding for Min/Mag filters

This patch improves pass rate of dEQP-VK.texture.explicit_lod.2d.sizes.*
from 68.0% (98/144) to 83.3% (120/144) by enabling sampler address
rounding mode when the selected filter is not nearest, which is the same
thing we do for OpenGL.

These tests check texture filtering for various texture sizes and mipmap
levels. The failures (without this patch) affect cases where the target
texture has odd dimensions (like 57x35) and either the Min or the Mag filter
is not nearest.
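
The quoted percentages follow directly from the test counts. A small
helper (hypothetical name) reproduces the arithmetic:

```c
/* Pass rate in percent; returns 0.0 when there are no tests. */
static double
pass_rate_percent(unsigned passed, unsigned total)
{
   return total ? 100.0 * passed / total : 0.0;
}
```

With 144 tests, 98 passes is roughly 68.06% and 120 passes is roughly
83.33%, which means 22 additional tests pass with the patch.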

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
7 years agoanv: Implement a depth stall restriction on gen7
Jason Ekstrand [Sat, 19 Nov 2016 22:05:06 +0000 (14:05 -0800)]
anv: Implement a depth stall restriction on gen7

Fixes around 60 Vulkan CTS tests on Haswell

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
7 years agonvc0/ir: use levelZero flag when the lod is set to 0
Ilia Mirkin [Wed, 19 Oct 2016 02:46:36 +0000 (22:46 -0400)]
nvc0/ir: use levelZero flag when the lod is set to 0

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
7 years agoradv: spir-v allows texture size query with and without lod.
Dave Airlie [Fri, 18 Nov 2016 03:58:30 +0000 (03:58 +0000)]
radv: spir-v allows texture size query with and without lod.

The translation to llvm was failing here due to required lod.

This fixes some new SteamVR shaders.

Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
7 years agoradv: fix image view creation for depth and stencil only
Dave Airlie [Tue, 15 Nov 2016 06:46:50 +0000 (06:46 +0000)]
radv: fix image view creation for depth and stencil only

This fixes the image view for sampling just the depth.

It removes some pointless swizzle code, and adds
a missing case for the x8_d24 format.

Fixes:
dEQP-VK.renderpass.formats.d32_sfloat_s8_uint.input.*
dEQP-VK.renderpass.formats.d24_unorm_s8_uint.input.*
dEQP-VK.renderpass.formats.x8_d24_unorm_pack32.input.*

Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
7 years agoradv: make sure to flush input attachments correctly.
Dave Airlie [Wed, 16 Nov 2016 23:41:29 +0000 (23:41 +0000)]
radv: make sure to flush input attachments correctly.

This fixes 9 of the
dEQP-VK.renderpass.attachment_allocation.input_output.*
tests.

Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>