git.libre-soc.org Git - mesa.git/log

Rob Clark [Mon, 24 Jun 2019 22:06:13 +0000 (15:06 -0700)]

freedreno/a5xx: fix batch leak in fd5 blitter path

Fixes: 3d198926a48 freedreno: use fd_bc_alloc_batch instead of fd_batch_create.
Signed-off-by: Rob Clark <robdclark@chromium.org>

Marek Olšák [Wed, 19 Jun 2019 23:12:24 +0000 (19:12 -0400)]

radeonsi: don't set spi_ps_input_* for monolithic shaders

The driver doesn't use these values and ac_rtld has assertions
expecting the value of 0.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Marek Olšák [Tue, 18 Jun 2019 23:06:57 +0000 (19:06 -0400)]

radeonsi: rename and re-document cache flush flags

SMEM and VMEM caches are L0 on gfx10.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Marek Olšák [Fri, 24 May 2019 23:56:17 +0000 (19:56 -0400)]

radeonsi: fix AMD_DEBUG=nofmask

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Marek Olšák [Thu, 20 Jun 2019 02:24:51 +0000 (22:24 -0400)]

radeonsi: flatten the switch for DPBB tunables

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>

Marek Olšák [Wed, 19 Jun 2019 23:00:50 +0000 (19:00 -0400)]

radeonsi: set the calling convention for inlined function calls

otherwise the behavior is undefined

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>

Nicolai Hähnle [Fri, 21 Sep 2018 13:38:42 +0000 (15:38 +0200)]

radeonsi: refactor si_update_vgt_shader_config

We'll have to extend this at some point, and using a bitfield union in
this way makes it easier to get the right index without excessive
branching.
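
A minimal sketch of the bitfield-union idea, assuming a key with one bit per optional stage. The names `vgt_stage_key`, `tess`, `gs` and `vgt_stage_index` are hypothetical and not taken from the driver. The bit layout of bitfields is implementation-defined, so the raw index values are only meaningful within one build:

```c
#include <stdint.h>

/* Hypothetical key: each enabled stage is one bit, and the whole key can be
 * read back as a single integer to index a table of precomputed states. */
union vgt_stage_key {
    struct {
        unsigned tess : 1; /* tessellation enabled */
        unsigned gs : 1;   /* geometry shader enabled */
    } u;
    uint32_t index;        /* all bits viewed as one value */
};

/* Build the index without branching over every stage combination. */
uint32_t vgt_stage_index(int has_tess, int has_gs)
{
    union vgt_stage_key key;

    key.index = 0;                 /* clear the unused bits first */
    key.u.tess = has_tess ? 1 : 0;
    key.u.gs = has_gs ? 1 : 0;
    return key.index;
}
```

Adding a new stage then means adding one bitfield member rather than another level of nested conditionals.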

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Nicolai Hähnle [Sun, 16 Jun 2019 23:24:29 +0000 (01:24 +0200)]

amd/rtld: update the ELF representation of LDS symbols

The initial prototype used a processor-specific symbol type, but
feedback suggests that an approach using a processor-specific section
name that encodes the alignment, analogous to SHN_COMMON symbols, is
preferred.

This patch keeps both variants around for now to reduce problems
with LLVM compatibility as we switch branches around.

This also cleans up the error reporting in this function.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Marek Olšák [Thu, 20 Jun 2019 01:47:46 +0000 (21:47 -0400)]

ac/surface: remove addrlib_family_rev_id

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Dylan Baker [Mon, 24 Jun 2019 23:24:05 +0000 (16:24 -0700)]

docs: update calendar, add news item and link release notes for 19.0.7

Dylan Baker [Mon, 24 Jun 2019 23:21:34 +0000 (16:21 -0700)]

docs: Add SHA256 sums for 19.0.7

Dylan Baker [Mon, 24 Jun 2019 21:56:04 +0000 (14:56 -0700)]

docs: Add 19.0.7 release notes

Ian Romanick [Thu, 20 Jun 2019 22:48:48 +0000 (15:48 -0700)]

glsl: Don't increase the iteration count when there are no terminators

Incrementing the iteration count was intended to fix an off-by-one error
when the first terminator was superseded by a later terminator.  If
there is no first terminator or later terminator, there is no off-by-one
error.  Incrementing the loop count creates one.  This can be seen in
loops like:

    do {
        if (something) {
            // No breaks or continues here.
        }
    } while (false);

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Abel Briggs <abelbriggs1@hotmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110953
Fixes: 646621c66da ("glsl: make loop unrolling more like the nir unrolling path")

Eric Anholt [Wed, 5 Jun 2019 22:15:07 +0000 (15:15 -0700)]

freedreno: Only upload the used part of UBO0 to the constant buffer.

We were pessimistically uploading all of it in case of indirection,
but we can just bump that when we encounter indirection.

total constlen in shared programs: 2529623 -> 2485933 (-1.73%)

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>

Eric Anholt [Wed, 5 Jun 2019 17:34:52 +0000 (10:34 -0700)]

freedreno: Stop treating UBO 0 specially in UBO uploading.

ir3_nir_analyze_ubo_ranges() has already told us how much of cb0 we
need to upload (all of it, since it will lower indirect UBO 0 accesses
from load_ubo back to indirection on the constant buffer).

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>

Rob Clark [Sun, 23 Jun 2019 19:52:05 +0000 (12:52 -0700)]

freedreno: Clamp UBO uploads to the constlen decided by the shader.

If the NIR-level analysis decided to move UBO loads to the constant
file, but the backend decided not to load those constants, we could
upload past the end of constlen. This is particularly relevant for
pre-a6xx, where we emit a different constlen between bin and render
variants.
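
The clamping can be sketched as follows. This is an illustration only: `struct ubo_range`, `clamped_upload_size` and the byte-based units are assumptions, not the driver's actual types:

```c
#include <stdint.h>

/* Hypothetical description of one UBO range that the NIR-level analysis
 * wants pushed into the constant file, in bytes. */
struct ubo_range {
    uint32_t offset; /* destination offset in the constant file */
    uint32_t size;   /* number of bytes to upload */
};

/* Return how many bytes of the range fit below constlen_bytes, the size the
 * backend actually reserved for this shader variant. */
uint32_t clamped_upload_size(struct ubo_range range, uint32_t constlen_bytes)
{
    if (range.offset >= constlen_bytes)
        return 0; /* the whole range lies past the end: upload nothing */

    uint32_t room = constlen_bytes - range.offset;
    return range.size < room ? range.size : room;
}
```

Because the limit is passed per call, the bin and render variants can each be clamped against their own constlen.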

(Fix by Rob, commit message by anholt)

Reviewed-by: Eric Anholt <eric@anholt.net>

Alyssa Rosenzweig [Fri, 21 Jun 2019 19:44:56 +0000 (12:44 -0700)]

panfrost: Allow up to 16 UBOs

This is the hardware max, as far as I can tell.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

Alyssa Rosenzweig [Fri, 21 Jun 2019 18:56:28 +0000 (11:56 -0700)]

panfrost: DRY between shader stage setup

Just a little spring cleanup, extending UBOs to vertex shaders in the
process.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

Alyssa Rosenzweig [Thu, 20 Jun 2019 22:51:31 +0000 (15:51 -0700)]

panfrost/midgard: Implement UBO reads

UBOs and uniforms now use a common code path with an explicit `index`
argument passed, enabling UBO reads.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

Alyssa Rosenzweig [Thu, 20 Jun 2019 23:51:08 +0000 (16:51 -0700)]

panfrost: Handle disabled/empty UBOs

Prevents an assert(0) later in this (not so edge) case. We still have to
have a dummy there.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

Alyssa Rosenzweig [Thu, 20 Jun 2019 23:41:39 +0000 (16:41 -0700)]

panfrost: Identify "uniform buffer count" bits

We've known about this for a while, but it was never formally in the
machine header files / decoder, so let's add them in.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

Alyssa Rosenzweig [Thu, 20 Jun 2019 23:32:06 +0000 (16:32 -0700)]

panfrost: Upload UBOs

Now that all the counting is sorted, it's a matter of passing along a
GPU address and going.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

Alyssa Rosenzweig [Thu, 20 Jun 2019 23:21:48 +0000 (16:21 -0700)]

panfrost: Allow for dynamic UBO count

We already uploaded UBOs, but only a fixed number (1) for uniforms;
let's upload as many as we compute we need.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

Alyssa Rosenzweig [Thu, 20 Jun 2019 23:16:07 +0000 (16:16 -0700)]

panfrost: Report UBO count

We look at the highest set bit in the UBO enable mask to work out the
maximum indexable UBO, i.e. the UBO count as we need to report to the
hardware.
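
As a rough sketch of that computation (the function name `ubo_count_from_mask` is hypothetical, and a 32-bit enable mask is assumed):

```c
#include <stdint.h>

/* Number of UBOs to report: one past the highest set bit of the enable
 * mask, or 0 when no UBO is enabled. */
unsigned ubo_count_from_mask(uint32_t enabled_mask)
{
    unsigned count = 0;

    /* Shift until no set bits remain; the number of shifts is the
     * position of the highest set bit plus one. */
    while (enabled_mask) {
        count++;
        enabled_mask >>= 1;
    }
    return count;
}
```

Note that a sparse mask such as 0x5 (UBOs 0 and 2 enabled) still reports a count of 3, since UBO 2 must remain indexable.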

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

Alyssa Rosenzweig [Thu, 20 Jun 2019 23:07:57 +0000 (16:07 -0700)]

panfrost: Constant buffer refactor

We refactor panfrost_constant_buffer to mirror v3d's constant buffer
handling, to enable UBOs as well as a single set of uniforms.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

Alyssa Rosenzweig [Mon, 24 Jun 2019 18:53:58 +0000 (11:53 -0700)]

panfrost: Replace varyings for point sprites

This doesn't handle Y-flipping, but it's good enough to render the stars
in Neverball.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

Alyssa Rosenzweig [Mon, 24 Jun 2019 18:01:05 +0000 (11:01 -0700)]

panfrost: Track point sprites in fragment shader key

In preparation for lowering point sprites, track them like we track
alpha testing state.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

Caio Marcelo de Oliveira Filho [Sat, 22 Jun 2019 07:25:48 +0000 (00:25 -0700)]

i965: Move resources lowering after NIR linking

Those either depend on information filled by the NIR linking steps OR
are restricted by those:

- gl_nir_lower_samplers: depends on UniformStorage being set by the
  linker.

- brw_nir_lower_image_load_store: After 6981069fc80 "i965: Ignore
  uniform storage for samplers or images, use binding info" we want
  this pass to happen after gl_nir_lower_samplers.

- gl_nir_lower_buffers: depends on UniformBlocks and
  SharedStorageBlocks being set by the linker.

For the regular GLSL code path, those data structures are filled
earlier.  For the NIR linking code path we need to generate the nir_shader
first and then process it -- and currently the processing works with all
shaders together.  So move the passes out of brw_create_nir into their
own function, called by brwProgramStringNotify() and
brw_link_shader().

This patch prepares the ground for ARB_gl_spirv, which will make use of
the NIR linker.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

Caio Marcelo de Oliveira Filho [Sat, 22 Jun 2019 00:11:54 +0000 (17:11 -0700)]

glsl/nir: Fix copying 64-bit values in uniform storage

The iterator `i` already walks the right amount now that it is
incremented by `dmul`, so there is no need to `* 2`. Fixes invalid memory
access in upcoming ARB_gl_spirv tests.

Failure bisected by Arcady Goldmints-Orlov.

Fixes: b019fe8a5b6 "glsl/nir: Fix handling of 64-bit values in uniform storage"
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>

Caio Marcelo de Oliveira Filho [Fri, 21 Jun 2019 23:55:08 +0000 (16:55 -0700)]

glsl/nir: Fix copying vector constant values

For n_columns == 1, we have a vector which is handled by the else
case. Fixes invalid memory access in upcoming ARB_gl_spirv tests.

Failure bisected by Arcady Goldmints-Orlov.

Fixes: 81e51b412e9 "nir: Make nir_constant a vector rather than a matrix"
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>

Daniel Schürmann [Fri, 25 Jan 2019 15:24:55 +0000 (16:24 +0100)]

amd/common: lower bitfield_extract to ubfe/ibfe.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>

Daniel Schürmann [Fri, 25 Jan 2019 15:08:38 +0000 (16:08 +0100)]

amd/common: lower bitfield_insert to bfm & bitfield_select

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>

Daniel Schürmann [Thu, 13 Jun 2019 09:34:01 +0000 (11:34 +0200)]

nir: introduce lowering of bitfield_insert to bfm and a new opcode bitfield_select.

bitfield_select is defined as:
bitfield_select(mask, base, insert) = (mask & base) | (~mask & insert)
matching the behavior of AMD's BFI instruction.
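
The definition above translates directly into C for 32-bit operands. The function below is a plain illustration of the formula, not code from the NIR sources:

```c
#include <stdint.h>

/* For every bit position, take the bit from `base` where `mask` is 1 and
 * from `insert` where `mask` is 0. */
uint32_t bitfield_select(uint32_t mask, uint32_t base, uint32_t insert)
{
    return (mask & base) | (~mask & insert);
}
```

For example, a mask of 0xFF00 keeps the second byte of `base` and takes every other bit from `insert`.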

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>

Daniel Schürmann [Thu, 2 May 2019 13:28:59 +0000 (15:28 +0200)]

nir/algebraic: Use unsigned comparison when lowering bitfield insert/extract

This lets us use the optimization pattern
(('ult', 31, ('iand', b, 31)), False) to remove the
bcsel instruction for code originating in D3D shaders.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>

Daniel Schürmann [Fri, 25 Jan 2019 12:56:49 +0000 (13:56 +0100)]

nir/algebraic: Remove unnecessary iand of [iu]bfe and bfm sources

The [iu]bfe and bfm instructions are defined to only use the five
least significant bits.
This optimizes a common pattern from D3D -> SPIR-V translation.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>

Daniel Schürmann [Sat, 26 Jan 2019 08:12:46 +0000 (09:12 +0100)]

nir: define behavior of nir_op_bfm and nir_op_u/ibfe according to SM5 spec.

That is: the five least significant bits provide the values of
'bits' and 'offset' which is the case for all hardware currently
supported by NIR and using the bfm/bfe instructions.
This patch also changes the lowering of bitfield_insert/extract
using shifts to not use bfm and removes the flag 'lower_bfm'.
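
A sketch of what "only the five least significant bits" means for 32-bit values. These helper functions are a reading of the behavior described above, written for illustration. They are not copied from NIR, and only the unsigned extract is shown:

```c
#include <stdint.h>

/* Bitfield mask: `bits` ones starting at bit `offset`. Only the low five
 * bits of each operand are used, so 36 behaves like 4. */
uint32_t bfm(uint32_t bits, uint32_t offset)
{
    bits &= 31;
    offset &= 31;
    return ((1u << bits) - 1u) << offset;
}

/* Unsigned bitfield extract: `bits` bits of `value` starting at `offset`,
 * again using only the low five bits of `offset` and `bits`. */
uint32_t ubfe(uint32_t value, uint32_t offset, uint32_t bits)
{
    offset &= 31;
    bits &= 31;
    /* bits == 0 yields an empty mask, so the result is 0. */
    return (value >> offset) & ((1u << bits) - 1u);
}
```

Since the operands are masked by definition, an explicit `iand` with 31 on either source is redundant.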

Tested-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>

Daniel Schürmann [Fri, 25 Jan 2019 11:48:44 +0000 (12:48 +0100)]

nir/algebraic: add optimization pattern for ('ult', a, ('and', b, a)) and friends.

These optimizations are based on the fact that
'and(a,b) <= umin(a,b)'.
For AMD, this series moves the optimization from LLVM to NIR,
so currently no vkpipeline-db changes here.
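
The bound holds because a bitwise AND can only clear bits, so the result can exceed neither operand. A small exhaustive check over 8-bit values illustrates it; both function names are made up for this example:

```c
#include <stdint.h>

/* Count 8-bit pairs that violate (a & b) <= umin(a, b). */
unsigned count_and_bound_violations(void)
{
    unsigned violations = 0;

    for (uint32_t a = 0; a < 256; a++) {
        for (uint32_t b = 0; b < 256; b++) {
            uint32_t umin = a < b ? a : b;
            if ((a & b) > umin)
                violations++;
        }
    }
    return violations;
}

/* The pattern from the subject line: a < (b & a) as an unsigned compare.
 * Given the bound above, this can never be true. */
int ult_a_and_b_a(uint32_t a, uint32_t b)
{
    return a < (b & a);
}
```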

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

Andreas Baierl [Fri, 21 Jun 2019 14:13:44 +0000 (16:13 +0200)]

lima/ppir: Add fsat op

Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Qiang Yu <yuq825@gmail.com>

Andreas Baierl [Fri, 21 Jun 2019 08:54:04 +0000 (10:54 +0200)]

lima/ppir: Add fneg op

Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Qiang Yu <yuq825@gmail.com>

Andreas Baierl [Fri, 21 Jun 2019 08:50:39 +0000 (10:50 +0200)]

lima/ppir: Add fabs op

Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Qiang Yu <yuq825@gmail.com>

Eric Engestrom [Fri, 21 Jun 2019 10:35:08 +0000 (11:35 +0100)]

util: support "y" and "n" in env_var_as_boolean()

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
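
For illustration, a boolean parser that accepts the short forms might look like the sketch below. `parse_boolean_string` is a hypothetical helper. The spellings accepted besides "y" and "n", and the case-sensitive comparison, are assumptions rather than a description of `env_var_as_boolean()` itself:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Map a user-supplied string to a boolean, falling back to default_value
 * when the string is missing or not recognized. */
bool parse_boolean_string(const char *str, bool default_value)
{
    if (str == NULL)
        return default_value;

    if (strcmp(str, "1") == 0 || strcmp(str, "true") == 0 ||
        strcmp(str, "y") == 0 || strcmp(str, "yes") == 0)
        return true;

    if (strcmp(str, "0") == 0 || strcmp(str, "false") == 0 ||
        strcmp(str, "n") == 0 || strcmp(str, "no") == 0)
        return false;

    return default_value; /* unrecognized value */
}
```

A caller would typically pass the result of `getenv()` as `str`.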

Andreas Baierl [Mon, 17 Jun 2019 15:16:24 +0000 (17:16 +0200)]

lima/ppir: lower ffma in ppir

Since we cannot handle ffma in ppir, lower it at the NIR level.

Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Qiang Yu <yuq825@gmail.com>

Samuel Pitoiset [Fri, 21 Jun 2019 14:17:22 +0000 (16:17 +0200)]

radv: add support for VK_AMD_buffer_marker

This simple extension might be useful for debugging purposes.
GAPID has support for it.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Tapani Pälli [Tue, 18 Jun 2019 10:50:52 +0000 (13:50 +0300)]

meson: error out if platforms contains empty string

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110939
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>

Nataraj Deshpande [Tue, 11 Jun 2019 15:01:50 +0000 (08:01 -0700)]

anv: Add HAL_PIXEL_FORMAT_IMPLEMENTATION_DEFINED in vk_format

When HAL_PIXEL_FORMAT_IMPLEMENTATION_DEFINED is used, then the platform
gralloc module will select a format based on the usage flags provided by
the camera device and the other endpoint of the stream.

The patch fixes a crash in Vulkan when the test is run with the camera
stream set to HAL_PIXEL_FORMAT_IMPLEMENTATION_DEFINED.

Test: android.graphics.cts.CameraVulkanGpuTest#testCameraImportAndRendering
on chromebook with camera HAL3.

v2: use AHARDWAREBUFFER_FORMAT_IMPLEMENTATION_DEFINED and take
AHARDWAREBUFFER_USAGE_CAMERA_MASK into account (Gurchetan)

Fixes: f1654fa7e31 "anv/android: support creating images from external format"
Signed-off-by: Nataraj Deshpande <nataraj.deshpande@intel.com>
Signed-off-by: Gurchetan Singh <gurchetansingh@chromium.org>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>

Timur Kristóf [Fri, 14 Jun 2019 12:03:28 +0000 (14:03 +0200)]

iris: move sysvals to their own constant buffer

This commit moves the sysvals to a separate, new constant buffer
at the end (before the shader constants). It also allows us to
remove the special handling we had for cbuf0, and enables all
constant buffers to support user-specified resources and user
buffers.

v2: (by Kenneth Graunke)
- Rebase on the previous patch to fix system value uploading.
- Fix disk cache num_cbufs calculation
- Fix passthrough TCS to report num_cbufs = 1 so upload actually occurs
- Change upload_sysvals to assert that num_cbufs > 0 when
num_system_values > 0.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

Kenneth Graunke [Thu, 20 Jun 2019 23:28:58 +0000 (18:28 -0500)]

iris: Mark cbuf0 as not needing uploading every single time

I neglected to mark cbuf0_needs_upload = false after uploading it.
The obvious fix regressed user clip plane tests, because of a second
bug: we also forgot to mark that they may need re-uploading when
changing shader programs (which may have more or fewer system values).

Thanks to Timur Kristóf for catching the original issue.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>

Eric Engestrom [Sat, 22 Jun 2019 20:55:03 +0000 (21:55 +0100)]

Revert "egl: drop empty eglfallbacks.c" and "egl: move fallback calls to eglapi.c"

This reverts commits cc4b68a80193e2a132cb62309292984a9428f2bb and
b27fb3eacab906ec06cd61b7d01e3425c3b3cbfc.

These caused a bunch of EGLSync tests to crash when they were previously
failing.

I have a hunch the tests are doing something wrong, like using
extensions without checking for their support, but until the issue is
investigated I'm just reverting these commits.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>

Eric Engestrom [Sat, 22 Jun 2019 15:09:48 +0000 (16:09 +0100)]

egl: drop empty eglfallbacks.c

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>

Eric Engestrom [Thu, 10 Jan 2019 16:45:53 +0000 (16:45 +0000)]

egl: move fallback calls to eglapi.c

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>

Eric Engestrom [Thu, 10 Jan 2019 13:12:40 +0000 (13:12 +0000)]

egl: drop `_eglReturnFalse()` fallbacks

v2: drop them altogether, they should never get called in the
first place (Emil)

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>

Eric Engestrom [Thu, 10 Jan 2019 16:41:38 +0000 (16:41 +0000)]

egl: remove unnecessary eglGetProcAddress() fallback

No need to add a function that returns `false` only to be cast into
a pointer, we can just use the existing `return NULL` :)

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>

Eric Engestrom [Thu, 10 Jan 2019 13:07:29 +0000 (13:07 +0000)]

egl: remove NULL assignments after calloc()

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>

Eric Engestrom [Tue, 8 Jan 2019 11:14:35 +0000 (11:14 +0000)]

egl: move bad_param check further up

This way other functions added in these entrypoints don't need to check
anything.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>

Kenneth Graunke [Fri, 21 Jun 2019 23:05:27 +0000 (18:05 -0500)]

iris: Drop bo != NULL check from blorp 48b invalidate function.

There is always a BO.

Kenneth Graunke [Fri, 21 Jun 2019 23:04:52 +0000 (18:04 -0500)]

Revert "iris: Don't check VF address high bits when there is no buffer."

This reverts commit db8f57a5cb4ab8e1ad789793678797c04e95de21.

This is bonkers. There will always be a BO.

Eric Anholt [Wed, 5 Jun 2019 22:39:22 +0000 (15:39 -0700)]

freedreno: Only upload UBO pointers for UBOs that haven't been lowered.

total constlen in shared programs: 2485933 -> 2462236 (-0.95%)

Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>

Eric Anholt [Wed, 5 Jun 2019 18:43:13 +0000 (11:43 -0700)]

freedreno: Remove silly return from ir3_optimize_nir().

We only ever return the shader we were passed in (but internally
modified).

Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>

Eric Anholt [Thu, 6 Jun 2019 21:27:13 +0000 (14:27 -0700)]

freedreno: Fix up end range of unaligned UBO loads.

We need the constants uploaded to cover the NIR offset plus the size,
not the aligned-down start of our upload range plus the size. Fixes
mistaken UBO analysis with mat3 loads.
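
A minimal sketch of the corrected range computation. The 16-byte granularity, `struct upload_range` and `ubo_upload_range` are assumptions chosen for the example, not the driver's actual values or names:

```c
#include <stdint.h>

#define UPLOAD_ALIGN 16u /* assumed upload granularity, in bytes */

struct upload_range {
    uint32_t start; /* first byte uploaded (aligned down) */
    uint32_t end;   /* one past the last byte uploaded (aligned up) */
};

/* Compute the aligned range that covers a load of `size` bytes at the
 * unaligned byte offset `offset`. */
struct upload_range ubo_upload_range(uint32_t offset, uint32_t size)
{
    struct upload_range r;

    r.start = offset & ~(UPLOAD_ALIGN - 1u);
    /* The end is derived from the real offset. Using r.start + size here
     * would fall short whenever offset is not aligned. */
    r.end = (offset + size + UPLOAD_ALIGN - 1u) & ~(UPLOAD_ALIGN - 1u);
    return r;
}
```

With `offset = 12` and `size = 8`, the load touches bytes 12 to 19, so the range must end at 32. Starting from the aligned-down start would give an end of 16 and miss the last four bytes.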

Fixes: 893425a607a6 ("freedreno/ir3: Push UBOs to constant file")
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>

Eric Anholt [Thu, 6 Jun 2019 19:19:06 +0000 (12:19 -0700)]

freedreno: Fix UBO load range detection on booleans.

NIR 1-bit bool dests will have a bit size of 1, and thus a calculated
"bytes" of 0. load_ubo is always loading from dwords in the source.

Fixes: 893425a607a6 ("freedreno/ir3: Push UBOs to constant file")
Reviewed-by: Rob Clark <robdclark@gmail.com>

Eric Anholt [Wed, 5 Jun 2019 22:00:57 +0000 (15:00 -0700)]

freedreno: Stop reporting max_const in shader-db.

We end up uploading constlen regardless, so max_const would only get
you slightly improved granularity in const usage in comparison.

Reviewed-by: Rob Clark <robdclark@gmail.com>

Eric Anholt [Wed, 5 Jun 2019 18:29:19 +0000 (11:29 -0700)]

freedreno: Include binning shaders in shader-db.

We want to see if we've improved our binning VS output, as well as the
render VS.

Reviewed-by: Rob Clark <robdclark@gmail.com>

Marek Olšák [Tue, 11 Jun 2019 22:27:04 +0000 (18:27 -0400)]

include: update GL headers from the registry

Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>

Alyssa Rosenzweig [Fri, 21 Jun 2019 20:06:04 +0000 (13:06 -0700)]

panfrost: Fix unused variable warning

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

Boris Brezillon [Wed, 19 Jun 2019 14:06:38 +0000 (16:06 +0200)]

panfrost: Remove the panfrost_driver abstraction

The non-DRM backend is gone. Let's get rid of the panfrost_driver
abstraction and call the panfrost_drm_xxx() functions directly.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

Boris Brezillon [Wed, 19 Jun 2019 14:05:01 +0000 (16:05 +0200)]

panfrost: Remove the perf counters interface

The DRM backend has a dummy implementation and the non-DRM backend is
gone, so let's remove this perf counter interface.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

Tomeu Vizoso [Fri, 21 Jun 2019 10:47:57 +0000 (12:47 +0200)]

panfrost: ci: Fix parsing of crashed tests

Without this fix, LAVA isn't parsing crashes as failed tests, because
the shell logging is interspersed within the fake deqp output.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

Alyssa Rosenzweig [Thu, 20 Jun 2019 22:35:22 +0000 (15:35 -0700)]

panfrost: Conditionally submit fragment job

If there are no tiling jobs and no clears, there is no need to submit a
fragment job (relevant for transform feedback).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

Alyssa Rosenzweig [Thu, 20 Jun 2019 22:25:17 +0000 (15:25 -0700)]

panfrost: Implement rasterizer discard

D'aww, look how cute that is now that scoreboarding is set up.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

Alyssa Rosenzweig [Thu, 20 Jun 2019 21:05:33 +0000 (14:05 -0700)]

panfrost: Track buffer initialization

We want to know if a given slice of a buffer is initialized at a
particular point in the execution of the program. This is accomplished
easily enough -- start out uninitialized and upon an operation writing
to the buffer, mark it initialized.

The motivation is to optimize away expensive operations (like wallpaper
blits) when reading from an uninitialized buffer; since it's
uninitialized, the results of these operations are undefined, and it's
legal to take the fast path ^_^
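
One way to picture the tracking is a per-buffer bitmask with one bit per slice. Everything named below (`tracked_buffer`, `MAX_SLICES`, the helper functions) is hypothetical and only illustrates the start-uninitialized, mark-on-write scheme:

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_SLICES 32u /* assumed upper bound on slices per buffer */

struct tracked_buffer {
    uint32_t initialized; /* bit i is set once slice i has been written */
};

/* Called by any operation that writes to the given slice. */
void buffer_mark_written(struct tracked_buffer *buf, unsigned slice)
{
    if (slice < MAX_SLICES)
        buf->initialized |= 1u << slice;
}

bool buffer_is_initialized(const struct tracked_buffer *buf, unsigned slice)
{
    return slice < MAX_SLICES && (buf->initialized & (1u << slice)) != 0;
}

/* An expensive step that preserves old contents is only needed when the
 * slice holds defined data; otherwise the fast path is legal. */
bool buffer_needs_preserve(const struct tracked_buffer *buf, unsigned slice)
{
    return buffer_is_initialized(buf, slice);
}
```

A zero-initialized `struct tracked_buffer` starts with every slice uninitialized, which matches the default described above.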

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

Alyssa Rosenzweig [Wed, 19 Jun 2019 18:27:59 +0000 (11:27 -0700)]

panfrost: Implement command stream scoreboarding

This is a rather complex change, adding a lot of code but ideally
cleaning up quite a bit as we go.

Within a batch (single frame), there are multiple distinct Mali job
types: SET_VALUE, VERTEX, TILER, FRAGMENT for the few that we emit right
now (eventually more for compute and geometry shaders). Each hardware
job has a mali_job_descriptor_header, which contains three fields of
interest: job index, a dependencies list, and a next job pointer.

The next job pointer in each job is used to form a linked list of
submitted jobs. Easy enough.

The job index and dependencies list, however, are used to form a
dependency graph (a DAG, where each hardware job is a node and each
dependency is a directed edge). Internally, this sets up a scoreboarding
data structure for the hardware to dispatch jobs in parallel, enabling
(for example) vertex shaders from different draws to execute in parallel
while there are strict dependencies between tiling the geometry of a
draw and running that vertex shader.

For a while, we got by with an incredible series of total hacks,
manually coding indices, lists, and dependencies. That worked for a
moment, but combinatorial kaboom kicked in and it became an
unmaintainable mess of spaghetti code.

We can do better. This commit explicitly handles the scoreboarding by
providing high-level manipulation for jobs. Rather than a command like
"set dependency #2 to index 17", we can express quite naturally "add a
dependency from job T on job V". Instead of some open-coded logic to
copy a draw pointer into a delicate context array, we now have an
elegant exposed API to simply "queue a job of type XYZ".

The design is influenced by both our current requirements (standard ES2
draws and u_blitter) as well as the need for more complex scheduling in
the future. For instance, blits can be optimized to use only a tiler
job, without a vertex job first (since the screen-space vertices are
known ahead-of-time) -- causing tiler-only jobs. Likewise, when using
transform feedback with rasterizer discard enabled, vertex jobs are
created (to run vertex shaders) with no corresponding tiler job. Both of
these cases break the original model and could not be expressed with the
open-coded logic. More generally, this will make it easier to add
support for compute shaders, geometry shaders, and fused jobs (an
optimization available on Bifrost).
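
The high-level job manipulation described above can be sketched roughly as follows. This is not the driver's API: the names, the fixed-size job array and the two dependency slots per job are assumptions made for the example. Dependencies may only point at earlier jobs, which keeps the graph acyclic:

```c
#include <stddef.h>

#define MAX_JOBS 64u
#define MAX_DEPS 2u /* assumed number of dependency slots per job */

enum job_type { JOB_SET_VALUE, JOB_VERTEX, JOB_TILER, JOB_FRAGMENT };

struct job {
    enum job_type type;
    unsigned index;          /* 1-based; 0 means "no job" */
    unsigned deps[MAX_DEPS]; /* indices of jobs this one waits on */
    struct job *next;        /* linked list in submission order */
};

struct batch {
    struct job jobs[MAX_JOBS];
    unsigned job_count;
    struct job *first, *last;
};

/* "Queue a job of type XYZ": returns its index, or 0 if the batch is full. */
unsigned batch_queue_job(struct batch *b, enum job_type type)
{
    if (b->job_count >= MAX_JOBS)
        return 0;

    struct job *j = &b->jobs[b->job_count++];
    j->type = type;
    j->index = b->job_count;
    for (unsigned i = 0; i < MAX_DEPS; i++)
        j->deps[i] = 0;
    j->next = NULL;

    /* Append to the next-job linked list. */
    if (b->last)
        b->last->next = j;
    else
        b->first = j;
    b->last = j;
    return j->index;
}

/* "Add a dependency from job T on job V": returns 1 on success, 0 if the
 * indices are invalid, V was not queued before T, or T has no free slot. */
int batch_add_dependency(struct batch *b, unsigned t, unsigned v)
{
    if (t == 0 || t > b->job_count || v == 0 || v >= t)
        return 0;

    struct job *j = &b->jobs[t - 1];
    for (unsigned i = 0; i < MAX_DEPS; i++) {
        if (j->deps[i] == 0) {
            j->deps[i] = v;
            return 1;
        }
    }
    return 0;
}
```

Under this model a tiler-only blit is a `JOB_TILER` with no dependencies, and a transform feedback draw with rasterizer discard is a `JOB_VERTEX` with no tiler job depending on it.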

Incidentally, this moves quite a bit of state from the driver context to
the batch, which helps with Rohan's refactor to eventually permit
pipelining across framebuffers (one important outstanding optimization
for FBO-heavy workloads).

v2: Add comment explaining the meaning of "primary batch" as suggested
by Tomeu (trivial - not reviewed).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Rohan Garg <rohan.garg@collabora.com>

Anuj Phogat [Fri, 14 Jun 2019 00:34:46 +0000 (17:34 -0700)]

intel/icl: Add new ICL PCI-IDs

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

Jason Ekstrand [Wed, 19 Jun 2019 22:07:43 +0000 (17:07 -0500)]

anv: Implement "pop-free" clipping

This is the preferred clipping mode: points no longer disappear the
moment part of the point crosses over the edge of the viewport, and
lines no longer have weird endpoints at viewport edges. We've just
never bothered to hook it up until now.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

Jason Ekstrand [Wed, 19 Jun 2019 21:04:54 +0000 (16:04 -0500)]

anv: Enable the guardband clip test

In workloads where there is a lot of geometry drawn that crosses over
the edge of the viewport, this should substantially improve clipper
performance. Not really sure why it's taken 3 years to turn it on but
we never got around to it.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

Jason Ekstrand [Wed, 19 Jun 2019 20:52:55 +0000 (15:52 -0500)]

i965,iris: Move guardband calculations to a common location

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

Mauro Rossi [Sat, 15 Jun 2019 05:39:02 +0000 (07:39 +0200)]

android: virgl: fix libmesa_winsys_virgl_common build and dependencies

Fixes the following build errors and resolves Bug 110922.
Also fixes missing symbols when linking the gallium_dri target.

external/mesa/src/gallium/winsys/virgl/drm/Android.mk:
error: libmesa_winsys_virgl (STATIC_LIBRARIES android-x86_64) missing libmesa_winsys_virgl_common (STATIC_LIBRARIES android-x86_64)
...
external/mesa/src/gallium/winsys/virgl/vtest/Android.mk:
error: libmesa_winsys_virgl_vtest (STATIC_LIBRARIES android-x86_64) missing libmesa_winsys_virgl_common (STATIC_LIBRARIES android-x86_64)
...
build/core/main.mk:728: error: exiting from previous errors.

In file included from external/mesa/src/gallium/winsys/virgl/vtest/virgl_vtest_socket.c:34:
external/mesa/src/gallium/winsys/virgl/vtest/virgl_vtest_winsys.h:35:10:
fatal error: 'virgl_resource_cache.h' file not found
^~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.

In file included from external/mesa/src/gallium/winsys/virgl/vtest/virgl_vtest_winsys.c:32:
external/mesa/src/gallium/winsys/virgl/vtest/virgl_vtest_winsys.h:35:10:
fatal error: 'virgl_resource_cache.h' file not found
#include "virgl_resource_cache.h"
^~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.

Fixes: b18f09a ("virgl: Introduce virgl_resource_cache")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Tested-by: Clayton Craft <clayton.a.craft@intel.com>

Mauro Rossi [Sat, 8 Jun 2019 13:15:09 +0000 (15:15 +0200)]

android: winsys/amdgpu,radv: fix generated amdgfxregs.h header dependencies

Fix Android build errors in winsys/amdgpu and radv
caused by 'amdgfxregs.h' not being found.

Changelog:
amd/common - generated $(intermediated)/common path is added to exports
winsys/amdgpu - libmesa_amd_common static dependency is added
radv - correct generated $(intermediated)/common path is added to includes

Fixes: f480b8a ("amd/common: use generated register header")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

Samuel Pitoiset [Mon, 20 May 2019 09:46:33 +0000 (11:46 +0200)]

radv: add support for VK_KHR_depth_stencil_resolve

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Samuel Pitoiset [Wed, 12 Jun 2019 09:39:58 +0000 (11:39 +0200)]

radv: pass sample locations for transitions before depth/stencil resolves

HTILE decompressions need the user sample locations if specified
in the current subpass.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Samuel Pitoiset [Wed, 22 May 2019 09:23:03 +0000 (11:23 +0200)]

radv: clear the depth/stencil resolve attachment if necessary

The driver might need to clear one aspect of the depth/stencil
resolve attachment before performing the resolve itself.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Samuel Pitoiset [Wed, 22 May 2019 12:56:01 +0000 (14:56 +0200)]

radv: decompress HTILE if the resolve src image is compressed

It's required to decompress HTILE before resolving with the
compute path.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Samuel Pitoiset [Wed, 22 May 2019 07:45:19 +0000 (09:45 +0200)]

radv: select the depth/stencil resolve method based on some conditions

Only fall back to the compute path for layers.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
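
As a rough illustration of the rule stated above, together with the HTILE
requirement noted for the compute path, the selection could look like the
sketch below. This is not the radv code: the enum, both function names, and
the `layer_count` / `src_htile_compressed` parameters are hypothetical, and
it assumes "layers" means a layer count greater than one.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; not taken from radv. */
enum ds_resolve_method {
   DS_RESOLVE_GRAPHICS,
   DS_RESOLVE_COMPUTE,
};

/* Prefer the graphics path; fall back to compute only for layered resolves
 * (assumption: "layers" means layer_count > 1). */
static enum ds_resolve_method
pick_ds_resolve_method(unsigned layer_count)
{
   return layer_count > 1 ? DS_RESOLVE_COMPUTE : DS_RESOLVE_GRAPHICS;
}

/* The compute path needs the source decompressed first; the graphics
 * path does not. */
static bool
needs_htile_decompress(enum ds_resolve_method method, bool src_htile_compressed)
{
   return method == DS_RESOLVE_COMPUTE && src_htile_compressed;
}
```

Keeping the decision in one small function makes it easy to add further
conditions later without touching the two resolve paths themselves.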

Samuel Pitoiset [Wed, 22 May 2019 07:42:12 +0000 (09:42 +0200)]

radv: implement all depth/stencil resolve modes using compute

This path supports layers but requires decompressing HTILE
before resolving. The driver also needs to fix up HTILE after
the resolve. This path is probably slower than the graphics one.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Samuel Pitoiset [Wed, 22 May 2019 07:43:39 +0000 (09:43 +0200)]

radv: implement all depth/stencil resolve modes using graphics

When using graphics, the driver doesn't need to decompress HTILE
before resolving. This path currently doesn't support layers,
so we have to fall back to the compute path.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Samuel Pitoiset [Mon, 20 May 2019 09:47:02 +0000 (11:47 +0200)]

radv: record if a render pass has depth/stencil resolve attachments

Only supported with vkCreateRenderPass2().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Samuel Pitoiset [Wed, 12 Jun 2019 08:20:41 +0000 (10:20 +0200)]

radv: rename has_resolve to has_color_resolve

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Samuel Pitoiset [Thu, 13 Jun 2019 11:56:22 +0000 (13:56 +0200)]

radv: emit framebuffer state from primary if secondary doesn't inherit it

Otherwise fast color/depth clears can't work because they depend
on the framebuffer.

This fixes the following CTS (when the small hint is disabled):
- dEQP-VK.geometry.layered.1d_array.secondary_cmd_buffer
- dEQP-VK.geometry.layered.2d_array.secondary_cmd_buffer
- dEQP-VK.geometry.layered.cube.secondary_cmd_buffer
- dEQP-VK.geometry.layered.cube_array.secondary_cmd_buffer

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110810
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107986
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Eric Engestrom [Fri, 23 Nov 2018 17:04:25 +0000 (17:04 +0000)]

drisw: move build logic to build systems

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>

Tomeu Vizoso [Fri, 21 Jun 2019 06:10:57 +0000 (08:10 +0200)]

panfrost: ci: Exclude two more flip-flops from results

These three tests pass on RK3399, but fail on RK3288:

dEQP-GLES2.functional.shaders.matrix.div.const_lowp_mat2_mat2_vertex
dEQP-GLES2.functional.shaders.operator.unary_operator.pre_increment_effect.highp_ivec4_vertex
dEQP-GLES2.functional.shaders.texture_functions.vertex.texture2dprojlod_vec3

They reliably pass when run individually, but reliably fail when run in
a full CI run.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>

Gert Wollny [Thu, 20 Jun 2019 13:38:30 +0000 (15:38 +0200)]

gallium/st: Add Gallium hud to swrast drivers

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>

Iago Toral Quiroga [Wed, 19 Jun 2019 08:28:12 +0000 (10:28 +0200)]

v3d: flush jobs writing to vertex buffers used in the current draw call

This can happen when any of our vertex buffers was written by a previous
transform feedback draw.

Fixes the following piglit tests:
spec/ext_transform_feedback/position-render-bufferbase
spec/ext_transform_feedback/position-render-bufferbase-discard
spec/ext_transform_feedback/position-render-bufferoffset
spec/ext_transform_feedback/position-render-bufferoffset-discard
spec/ext_transform_feedback/position-render-bufferrange
spec/ext_transform_feedback/position-render-bufferrange-discard

Reviewed-by: Eric Anholt <eric@anholt.net>

Iago Toral Quiroga [Wed, 19 Jun 2019 07:48:12 +0000 (09:48 +0200)]

v3d: flush jobs reading from transform feedback output buffers

If we are about to write to a transform feedback buffer, we should
make sure that we flush any prior work that intended to read from
any of these buffers.

Fixes piglit test:
spec/ext_transform_feedback/immediate-reuse

Reviewed-by: Eric Anholt <eric@anholt.net>

Iago Toral Quiroga [Wed, 19 Jun 2019 08:23:43 +0000 (10:23 +0200)]

v3d: add a helper to check if transform feedback is enabled

v2: We should be safe assuming that bind_vs != NULL (Eric)

Reviewed-by: Eric Anholt <eric@anholt.net>

Dave Airlie [Wed, 19 Jun 2019 20:47:08 +0000 (06:47 +1000)]

llvmpipe: make remove_shader_variant static.

This isn't used outside this file.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>

Eric Engestrom [Wed, 1 May 2019 10:51:01 +0000 (11:51 +0100)]

util/os_file: resize buffer to what was actually needed

Fixes: 316964709e21286c2af5 "util: add os_read_file() helper"
Reported-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
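
The idea in the subject line, growing a buffer while reading and then
trimming it to the size actually used, can be sketched as follows. This is
only an illustration and not Mesa's os_read_file(): the helper name
`read_stream_shrunk`, its signature, and the starting capacity of 64 bytes
are all assumptions.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not Mesa's os_read_file(): reads the rest of `f`
 * into a heap buffer, NUL-terminates it, and shrinks the allocation to fit.
 * Returns NULL on allocation failure. */
static char *
read_stream_shrunk(FILE *f, size_t *out_len)
{
   size_t cap = 64, len = 0;
   char *buf = malloc(cap);
   if (!buf)
      return NULL;

   for (;;) {
      /* Always keep one byte spare for the terminating NUL. */
      size_t n = fread(buf + len, 1, cap - len - 1, f);
      len += n;
      if (n == 0)
         break; /* end of file or read error */

      if (len == cap - 1) {
         /* Buffer is full: double it before reading more. */
         char *bigger = realloc(buf, cap * 2);
         if (!bigger) {
            free(buf);
            return NULL;
         }
         buf = bigger;
         cap *= 2;
      }
   }
   buf[len] = '\0';

   /* Shrink to what was actually needed; if realloc fails, the original
    * (larger) block is still valid, so keep using it. */
   char *fit = realloc(buf, len + 1);
   if (fit)
      buf = fit;

   if (out_len)
      *out_len = len;
   return buf;
}
```

Without the final realloc(), a doubling strategy can leave up to roughly
half of the allocation unused for the lifetime of the returned string.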

Tomeu Vizoso [Thu, 20 Jun 2019 18:57:30 +0000 (20:57 +0200)]

panfrost: ci: Update expectations

These tests have been fixed recently.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>

Alyssa Rosenzweig [Wed, 19 Jun 2019 14:23:27 +0000 (07:23 -0700)]

panfrost/midgard: Broadcast swizzle

Fixes regression in shaders using ball/etc by explicitly passing through
the number of channels in the NIR op and broadcasting the last
components of the channel appropriately, as the Midgard ops are all vec4
implicitly but NIR can be vec2/3.

v2: Don't also regress every other swizzle in Equestria.

v3: Don't regress the swizzles at Canterlot High either.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
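
A minimal sketch of the broadcast described in this commit message,
assuming a swizzle is stored as one selector byte per channel. The helper
name `broadcast_swizzle` and its signature are hypothetical and not taken
from the Midgard compiler.

```c
#include <assert.h>

/* Hypothetical helper: extend an n-component swizzle (n in 1..4) to the
 * four channels a vec4 operation expects by repeating the last valid
 * selector in the unused channels. */
static void
broadcast_swizzle(const unsigned char *src, unsigned n, unsigned char out[4])
{
   for (unsigned c = 0; c < 4; c++)
      out[c] = src[c < n ? c : n - 1];
}
```

For a vec2 source with selectors {2, 0} this yields {2, 0, 0, 0}, so the
extra channels repeat a component that is already being compared and do
not introduce unrelated data into an all-equal style reduction.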

Kenneth Graunke [Thu, 20 Jun 2019 06:08:04 +0000 (01:08 -0500)]

iris: Use stream uploader for shader draw parameters.

Most vertex data lives in user VBOs in IRIS_MEMZONE_OTHER, which
typically have high bits set to 0xffff. The shader draw parameters were
being uploaded in IRIS_MEMZONE_DYNAMIC, which has high bits set to 0x2.
This was causing a lot of ping-ponging of high bits, leading to
unnecessary VF cache flushing.

Cuts 7.2% of the flushes in the Civilization VI demo on Kabylake GT2.

Kenneth Graunke [Thu, 20 Jun 2019 05:47:33 +0000 (00:47 -0500)]

iris: Don't check VF address high bits when there is no buffer.

If there is no buffer, then it doesn't matter. Leave the old stale
high bits in place (for next time) and don't bother invalidating.

Cuts 5.6% of the flushes in the Civilization VI demo on Kabylake GT2.
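
The high-bits bookkeeping discussed in this commit and the preceding one
can be pictured with the sketch below. It is an illustration only, not the
iris implementation: `struct vf_tracker`, `vf_needs_flush`, `MAX_VB_SLOTS`,
and the choice of bits 32 and above as the "high bits" are assumptions made
for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_VB_SLOTS 33 /* hypothetical slot count */

/* Hypothetical per-context state: the high address bits last seen for
 * each vertex buffer slot. */
struct vf_tracker {
   uint16_t last_high_bits[MAX_VB_SLOTS];
};

/* Returns true when the VF cache would need flushing because the high
 * address bits of this slot changed since the last draw. */
static bool
vf_needs_flush(struct vf_tracker *t, unsigned slot, bool has_buffer,
               uint64_t address)
{
   /* No buffer bound: nothing is fetched, so keep the stale high bits
    * for next time and skip the check. */
   if (!has_buffer)
      return false;

   uint16_t high = (uint16_t)(address >> 32);
   if (high == t->last_high_bits[slot])
      return false;

   t->last_high_bits[slot] = high;
   return true;
}
```

With this scheme, a slot that alternates between addresses whose high bits
are 0xffff and 0x2 reports a flush on every draw, which is the ping-ponging
the stream uploader change avoids; a slot that is temporarily unbound keeps
its recorded bits, so rebinding a buffer from the same zone costs nothing.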

Kenneth Graunke [Thu, 20 Jun 2019 04:12:52 +0000 (23:12 -0500)]

iris: Drop RT flushes from depth stencil clearing flushes.

These clears write depth and stencil, not color, so there's no need
to flush the render target.
