Juan A. Suarez Romero [Mon, 29 Apr 2019 15:10:24 +0000 (17:10 +0200)]
anv: enable descriptor indexing capabilities
This enables the remaining capabilities in SPV_EXT_descriptor_indexing.
Fixes: 6e230d7607f "anv: Implement VK_EXT_descriptor_indexing"
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Juan A. Suarez Romero [Mon, 29 Apr 2019 15:05:13 +0000 (17:05 +0200)]
radv: enable descriptor indexing capabilities
This enables the remaining capabilities in SPV_EXT_descriptor_indexing.
Fixes: 0e10790558b "radv: Enable VK_EXT_descriptor_indexing."
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Juan A. Suarez Romero [Mon, 29 Apr 2019 15:02:45 +0000 (17:02 +0200)]
spirv: add missing SPV_EXT_descriptor_indexing capabilities
Add ShaderNonUniformEXT, UniformBufferArrayNonUniformIndexingEXT,
SampledImageArrayNonUniformIndexingEXT,
StorageBufferArrayNonUniformIndexingEXT,
StorageImageArrayNonUniformIndexingEXT,
InputAttachmentArrayNonUniformIndexingEXT,
UniformTexelBufferArrayNonUniformIndexingEXT and
StorageTexelBufferArrayNonUniformIndexingEXT capabilities.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Caio Marcelo de Oliveira Filho [Tue, 30 Apr 2019 00:07:01 +0000 (17:07 -0700)]
spirv: Properly handle SpvOpAtomicCompareExchangeWeak
The code was handling the Weak variant in some cases, but missing
others, e.g. the get_deref_nir_atomic_op. Add all the missing cases
with the same behavior of the non-Weak SpvOpAtomicCompareExchange.
Note that the Weak variant is basically an alias, as SPIR-V 1.3,
Revision 7 says
"OpAtomicCompareExchangeWeak
Deprecated (use OpAtomicCompareExchange).
Has the same semantics as OpAtomicCompareExchange."
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tomeu Vizoso [Mon, 29 Apr 2019 16:33:22 +0000 (16:33 +0000)]
panfrost/ci: Initial commit
These files implement running almost all of deqp-gles2 on Chomebooks of
the rk3399-gru-kevin type in Collabora's LAVA lab.
The approach follows what is currently being used for virglrenderer,
but scheduling the actual test jobs via LAVA.
We start by building a container in Docker that contains a suitable
rootfs and kernel for the DUT, deqp and all dependencies for building
Mesa itself.
The Mesa is built and the rootfs, deqp and Mesa are combined in a cpio
ramdisk. A LAVA job is generated, submitted to LAVA and the results are
processed by simply comparing them to the expectations that are stored
in git. Any code that changes the expectations (hopefully tests are
fixed) needs to also update the expectations file.
The next step is adding support for other devices, possibly in other
LAVA labs.
In order to use this, the repository has to be configured to run the
gitlab-ci.yaml file from the panfrost/ci dir, and a LAVA token needs to
be setup.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Rafael Antognolli [Mon, 29 Apr 2019 23:02:58 +0000 (16:02 -0700)]
iris: Do not advertise multisampled image load/store.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Rob Clark [Sun, 28 Apr 2019 16:35:15 +0000 (09:35 -0700)]
freedreno/a6xx: pre-bake UBWC flags in texture-view
Small cleanup. No need to defer this to emit time.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Rob Clark [Sun, 28 Apr 2019 16:23:29 +0000 (09:23 -0700)]
freedreno/a6xx: small texture emit cleanup
Prep work for fb_read (blend_equation_advanced)
Switch to using 'enum pipe_shader_type' everywhere, and (optional, in
non-cache / slowpath case) pass ctx instead of image/ssbo state. In the
fb_read case we also need to access the framebuffer state, so having
the ctx simplifies things.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Rob Clark [Fri, 26 Apr 2019 21:40:17 +0000 (14:40 -0700)]
freedreno/ir3: switch fragcoord to sysval
Because who are we kidding... it is a sysval.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Plamena Manolova [Mon, 29 Apr 2019 19:57:58 +0000 (22:57 +0300)]
i965: Re-enable fast color clears for GEN11.
This patch re-enables fast color clears for GEN11.
It also ensures that we use linear color formats
for sRGB surfaces during fast clears.
Signed-off-by: Plamena Manolova <plamena.n.manolova@gmail.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Rafael Antognolli [Wed, 3 Apr 2019 00:08:52 +0000 (17:08 -0700)]
intel/blorp: Make blorp update the clear color in gen11.
Hardware docs say that Gen11 requires the use of two MI_ATOMICs of size
QWORD when updating the clear color. The second MI_ATOMIC also needs CS
Stall and Return Data Control set.
v2: Remove include of srgb header (Lionel)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Rafael Antognolli [Wed, 3 Apr 2019 00:07:17 +0000 (17:07 -0700)]
intel/genxml: Update MI_ATOMIC genxml definition.
Change some of the single bit fields to booleans, and add an enum with
the definition of the ATOMIC_OPCODE.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jordan Justen [Fri, 19 Apr 2019 22:31:29 +0000 (15:31 -0700)]
intel/genxml: Support base-16 in value & start fields in gen_sort_tags.py
With python's int(), if the optional second parameter is 0, then
python will support the 0x prefix for hex numbers.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Plamena Manolova [Thu, 14 Mar 2019 20:28:20 +0000 (22:28 +0200)]
isl: Set ClearColorConversionEnable.
The ClearColorConversionEnable bit needs to be set
for GEN11 when inderect clear colors are used.
Signed-off-by: Plamena Manolova <plamena.n.manolova@gmail.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Eric Engestrom [Wed, 24 Apr 2019 11:26:58 +0000 (12:26 +0100)]
delete autotools input files
Leftovers from when autotools was deleted.
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Eric Engestrom [Wed, 24 Apr 2019 11:18:46 +0000 (12:18 +0100)]
delete autotools .gitignore files
One special case, `src/util/xmlpool/.gitignore` is not entirely deleted,
as `xmlpool.pot` still gets generated (eg. by `ninja xmlpool-pot`).
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Kenneth Graunke [Mon, 29 Apr 2019 20:25:12 +0000 (13:25 -0700)]
iris: Only enable GL_AMD_depth_clamp_separate on Gen9+
The hardware feature is new as of Gen9+. I accidentally enabled it on
Gen8.
Kenneth Graunke [Mon, 29 Apr 2019 06:25:10 +0000 (23:25 -0700)]
iris: Set XY Clipping correctly.
I was setting it based off a pipe_rasterizer_state field that appears
to be entirely dead outside of the draw module respecting it.
I should be setting it when the primitive type reaching the SF is
neither points nor lines. This is, unfortunately, rather dirty,
as we have to look at the rasterizer state, the geometry shader state,
the tessellation evaluation shader state, and the primitive type...
Rhys Perry [Thu, 25 Apr 2019 13:44:40 +0000 (14:44 +0100)]
ac,ac/nir: use a better sync scope for shared atomics
https://reviews.llvm.org/rL356946 (present in LLVM 9 and later) changed
the meaning of the "system" sync scope, making it no longer restricted to
the memory operation's address space. So a single address space sync scope
is needed for shared atomic operations (such as "system-one-as" or
"workgroup-one-as") otherwise buffer_wbinvl1 and s_waitcnt instructions
can be created at each shared atomic operation.
This mostly reimplements LLVMBuildAtomicRMW and LLVMBuildAtomicCmpXchg
to allow for more sync scopes and uses the new functions in ac->nir with
the "workgroup-one-as" or "workgroup" sync scopes.
F1 2017 (4K, Ultra High settings, TAA), avg FPS : 59 -> 59.67 (+1.14%)
Strange Brigade (4K, ~highest settings), avg FPS : 51.5 -> 51.6 (+0.19%)
RotTR/mountain (4K, VeryHigh settings, FXAA), avg FPS : 57.2 -> 57.2 (+0.0%)
RotTR/tomb (4K, VeryHigh settings, FXAA), avg FPS : 42.5 -> 43.0 (+1.17%)
RotTR/valley (4K, VeryHigh settings, FXAA), avg FPS : 40.7 -> 41.6 (+2.21%)
Warhammer II/fallen, avg FPS : 31.63 -> 31.83 (+0.63%)
Warhammer II/skaven, avg FPS : 37.77 -> 38.07 (+0.79%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Hal Gentz [Sun, 24 Mar 2019 22:52:39 +0000 (16:52 -0600)]
glx: Fix synthetic error generation in __glXSendError
To quote Uli Schlachter, who understands this stuff more than I do:
> The function __glXSendError() in mesa's src/glx/glx_error.c invents an X11
> protocol error out of thin air. For the sequence number it uses dpy->request.
> This is the sequence number of the last request that was sent. _XError() will
> then update dpy->last_request_read based on the sequence number of the error
> that just "came in".
>
> If now another something comes in with a sequence number less than
> dpy->last_request_read, since sequence numbers are monotonically increasing,
> widen() will incorrectly add 1<<32 to the sequence number and things might go
> downhill afterwards.
`__glXSendErrorForXcb` was also patched, as that's the function that
`glXCreateContextAttribsARB` actually uses.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99781
Cc: mesa-stable@lists.freedesktop.org
Fixes: ad503c41 'apple: Initial import of libGL for OSX from AppleSGLX svn repository'
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Hal Gentz <zegentzy@protonmail.com>
Lionel Landwerlin [Sat, 27 Apr 2019 03:35:32 +0000 (11:35 +0800)]
Revert "anv: limit URB reconfigurations when using blorp"
In commit
0d46e404 ("anv: limit URB reconfigurations when using
blorp") we tried to limit the number of URB reconfiguration by
checking if the last allocation is large enough to fit the blorp
dispatch.
We used the last bound pipeline to compare the allocation. The problem
with this is that the pipeline is bound but its commands might not
have been emitted into the command buffer yet.
Let's just revert commit
0d46e404677264bfb12ada15290e39c10a5eb455
since it didn't seem to yield any performance improvement.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 0d46e404 ("anv: limit URB reconfigurations when using blorp")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110535
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Erik Faye-Lund [Thu, 7 Mar 2019 12:21:50 +0000 (13:21 +0100)]
mesa/st: remove always-false state
This code is essentially dead now.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Erik Faye-Lund [Wed, 6 Mar 2019 11:18:28 +0000 (12:18 +0100)]
mesa/st: accept NULL and empty buffer objects
It's prefectly legal and well-defined to render using a non-existing
or empty buffer object. The data coming out of the buffer object isn't
well defined unless we have the robustness flag set on the context, but
that's a different matter, and up to the shader hardware; it's the same
as out-of-bounds reads.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Erik Faye-Lund [Wed, 6 Mar 2019 12:28:51 +0000 (13:28 +0100)]
swr: support NULL-resources
It's legal for a buffer-object to have a NULL-resource, but let's just
skip over it, as there's nothing to do.
This patch switches the order of the conditionals in swr_update_derived,
so the logic becomes a bit more straight forward:
if (is_user_buffer)
...
else if (resource)
...
else
...
...instead of this:
if (!is_user_buffer)
if (resource)
...
else
...
else
...
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Alok Hota <alok.hota@intel.com>
Erik Faye-Lund [Wed, 6 Mar 2019 12:28:42 +0000 (13:28 +0100)]
nouveau: support NULL-resources
It's legal for a buffer-object to have a NULL-resource, but let's just
skip over it, as there's nothing to do.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Acked-by: Karol Herbst <kherbst@redhat.com>
Erik Faye-Lund [Wed, 6 Mar 2019 12:28:24 +0000 (13:28 +0100)]
i915: support NULL-resources
It's legal for a buffer-object to have a NULL-resource, but let's just
skip over it, as there's nothing to do.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Erik Faye-Lund [Wed, 6 Mar 2019 12:29:35 +0000 (13:29 +0100)]
gallium/u_vbuf: support NULL-resources
It's legal for a buffer-object to have a NULL-resource, but let's just
skip over it, as there's nothing to do.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Erik Faye-Lund [Tue, 4 Dec 2018 12:52:25 +0000 (13:52 +0100)]
mesa/st: remove impossible error-check
st_setup_current never sets this flag, and it's already checked against
right before. So let's remove this pointless check.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Andres Gomez [Wed, 30 Jan 2019 13:58:42 +0000 (15:58 +0200)]
glsl/linker: check for xfb_offset aliasing
From page 76 (page 80 of the PDF) of the GLSL 4.60 v.5 spec:
" No aliasing in output buffers is allowed: It is a compile-time or
link-time error to specify variables with overlapping transform
feedback offsets."
Currently, this is expected to fail, but it succeeds:
"
...
layout (xfb_offset = 0) out vec2 a;
layout (xfb_offset = 0) out vec4 b;
...
"
Fixes the following piglit test:
tests/spec/arb_enhanced_layouts/compiler/transform-feedback-layout-qualifiers/xfb_offset/invalid-overlap.vert
Fixes the following test:
KHR-GL44.enhanced_layouts.xfb_output_overlapping
v2:
- Use a data structure to track the used components instead of a
nested loop (Ilia).
v3:
- Take the BITSET_WORD array out from the
gl_transform_feedback_buffer struct and make it local to the
validation process (Timothy).
- Do not use a nested scope for the validation (Timothy).
v4:
- Add reference to the fixed piglit test in the commit log.
- Add reference to the fixed VK-GL-CTS test in the commit
log (Tapani).
- Empty initialize the BITSET_WORD pointers array (Tapani).
Cc: Timothy Arceri <tarceri@itsqueeze.com>
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Patrick Lerda [Mon, 29 Apr 2019 08:43:51 +0000 (10:43 +0200)]
lima/ppir: fix pointer referenced after a free
Issue detected by valgrind.
Fixes: 92d7ca4b1cd ("gallium: add lima driver")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Eleni Maria Stea [Mon, 29 Apr 2019 07:00:17 +0000 (09:00 +0200)]
radv: consider MESA_VK_VERSION_OVERRIDE when setting the api version
Before setting the physical device API version, we should check if the
MESA_VK_VERSION_OVERRIDE environment variable is set and take it into
account.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Kenneth Graunke [Wed, 3 Apr 2019 21:24:31 +0000 (14:24 -0700)]
intel/fs: Don't emit empty ELSE blocks.
While we can clean this up later, it's trivial to not generate the
stupid code in the first place, which saves some optimization work.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Kenneth Graunke [Mon, 8 Apr 2019 18:22:20 +0000 (11:22 -0700)]
nir: Add a new nir_cf_list_is_empty_block() helper.
Helper and name suggested by Eric Anholt.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Kenneth Graunke [Mon, 8 Apr 2019 18:10:08 +0000 (11:10 -0700)]
glsl/list: Add an exec_list_is_singular() helper.
Similar to list_is_singular() in util/list.h.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tapani Pälli [Fri, 12 Apr 2019 09:52:43 +0000 (12:52 +0300)]
anv: expose VK_EXT_queue_family_foreign on Android
VK_ANDROID_external_memory_android_hardware_buffer requires this
extension. It is safe to enable it since currently aux usage is
disabled for ahw buffers.
Fixes following dEQP extension dependency test on Android:
dEQP-VK.api.info.device#extensions
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Andreas Baierl [Fri, 26 Apr 2019 13:06:13 +0000 (15:06 +0200)]
lima/ppir: Add gl_FragCoord handling
Treat gl_FragCoord variable as a system value and lower the w component
with a nir pass.
Add the necessary bits for correct codegen.
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Andreas Baierl [Fri, 26 Apr 2019 13:01:43 +0000 (15:01 +0200)]
nir: add rcp(w) lowering for gl_FragCoord
On some hardware (e.g. Mali400) the shader needs to apply some
transformations for correct gl_FragCoord handling. The lowering
actions look like the following in pseudocode:
gl_FragCoord.xyz = gl_FragCoord_orig.xyz
gl_FragCoord.w = 1.0 / gl_FragCoord_orig.w
Add this lowering as a nir pass in preparation for using it in the driver.
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Romain Failliot [Sat, 27 Apr 2019 21:02:21 +0000 (17:02 -0400)]
docs: changed "Done" to "DONE" in features.txt
Mesamatrix.net expects uppercase.
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Alyssa Rosenzweig [Sun, 28 Apr 2019 21:39:20 +0000 (21:39 +0000)]
panfrost: Workaround -bshadow regression
I have *no* idea what's happening here, but let's not regress an app
that used to work in the mean time while we're figuring it out..
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Alyssa Rosenzweig [Sun, 28 Apr 2019 15:46:47 +0000 (15:46 +0000)]
panfrost/midgard: Safety check immediate precision degradations
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Alyssa Rosenzweig [Sun, 28 Apr 2019 15:21:34 +0000 (15:21 +0000)]
panfrost: Use fp32 (not fp16) varyings
In a perfect world, we'd use fp16 varyings for mediump and fp32 for
highp, allowing us to get a performance win without sacrificing
conformance. Unfortunately, we're not there (yet), so it's better we
assume always fp32 than always fp16 to avoid artefacts / breaking a lot
of deqp.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Alyssa Rosenzweig [Sun, 28 Apr 2019 04:58:43 +0000 (04:58 +0000)]
panfrost/midgard: imov workaround
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Alyssa Rosenzweig [Sun, 28 Apr 2019 04:38:01 +0000 (04:38 +0000)]
panfrost/midgard: Fix tex propogation
Unbreaks mpv.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Alyssa Rosenzweig [Sun, 28 Apr 2019 03:47:57 +0000 (03:47 +0000)]
panfrost/midgard: Fix regressions in -bjellyfish
Two fixes here, one is that we tried to copyprop non-strictly-SSA values
which was bound to fly in our face. The other was peeling back the imov
workaround.. Turns out we still need that. More research is needed
still, but let's not regress real apps.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Alyssa Rosenzweig [Sat, 27 Apr 2019 23:49:52 +0000 (23:49 +0000)]
panfrost/midgard: Only copyprop without an outmod
With an outmod, we would need to propagate that through, which is for
future work.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Alyssa Rosenzweig [Sat, 27 Apr 2019 23:46:38 +0000 (23:46 +0000)]
Revert "panfrost/midgard: Extend copy propagation pass"
Fixes: commit b53b4573c3f0571253672e44ce7d6310d9f987bf.
Optimization gone wrong. In the future, we should try this again (it's a
net win if implemented right), but at the moment this just regresses.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Samuel Pitoiset [Sat, 27 Apr 2019 10:07:51 +0000 (12:07 +0200)]
radv: add missing VEGA20 chip in radv_get_device_name()
Otherwise it returns "AMD RADV unknown".
Cc: 19.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Kenneth Graunke [Sat, 27 Apr 2019 07:24:05 +0000 (00:24 -0700)]
iris: Fix zeroing of transform feedback offsets in strange cases.
Some of the dEQP.functional.transform_feedback tests end up doing
the following sequence of operations:
1. BeginTransformFeedback
2. PauseTransformFeedback
3. Draw
4. ResumeTransformFeedback
At step 1, we'd pack 3DSTATE_SO_BUFFER commands saying to zero the
SO_WRITE_OFFSET registers. At step 2, we disable streamout, so step 3
doesn't bother emitting those commands. Then, step 4 re-packs new
3DSTATE_SO_BUFFER commands with offset = 0xFFFFFFFF, saying to continue
appending at the existing offset. This loads the value from the BO as
the offsets - but we never actually zeroed it.
So, just maintain a flag saying "we actually emitted the commands",
and stomp offset back to zero until we emit some.
Eric Anholt [Sat, 27 Oct 2018 01:19:37 +0000 (18:19 -0700)]
vc4: Fall back to renderonly if the vc4 driver doesn't have v3d.
I have a platform with vc4 display but V3D 4.x. We can fall back on
kmsro's probing to bring up the v3d gallium driver.
Acked-by: Rob Clark <robdclark@chromium.org>
Eric Anholt [Wed, 3 Apr 2019 22:40:22 +0000 (15:40 -0700)]
kmsro: Add support for V3D.
Like vc4, we expect to have SOCs with various displays that have a single
V3D instance for rendering.
v2: Add v3d to the list of drivers that make enabling kmsro valid.
Acked-by: Rob Clark <robdclark@chromium.org>
Marek Olšák [Thu, 25 Apr 2019 23:42:25 +0000 (19:42 -0400)]
radeonsi: don't ignore PIPE_FLUSH_ASYNC
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Eric Anholt [Thu, 25 Apr 2019 19:58:12 +0000 (12:58 -0700)]
v3d: Fix detection of TMU write sequences in register spilling.
We can't use the QPU functions to detect this until register allocation is
done and we've moved inst->dst into inst->qpu.
Fixes bad TMU sequences from register spilling in
KHR-GLES31.core.compute_shader.shared-max.
Eric Anholt [Thu, 25 Apr 2019 19:29:55 +0000 (12:29 -0700)]
v3d: Fix detection of the last ldtmu before a new TMU op.
We were looking at the start instruction, instead of scanning through the
list of following instructions to find any more ldtmus.
Eric Anholt [Thu, 25 Apr 2019 18:30:39 +0000 (11:30 -0700)]
v3d: Re-add support for memory_barrier_shared.
Looks like I lost it in a rebase conflict resolution. We'd hit the
unknown intrinsic assertion in
KHR-GLES31.core.compute_shader.shared-struct.
Fixes: 6b1c65982509 ("v3d: Add Compute Shader compilation support.")
Eric Anholt [Tue, 23 Apr 2019 18:10:56 +0000 (11:10 -0700)]
Revert "v3d: Disable PIPE_CAP_BLIT_BASED_TEXTURE_TRANSFER."
This reverts commit
ccce9409470c1053c40c822d759b9bd417062bc0, leaving a
note as to why we had to (corruption in chromium, breaking some GLES3.1
tests).
Eric Anholt [Mon, 22 Apr 2019 18:04:32 +0000 (11:04 -0700)]
v3d: Don't try to update the shadow texture for separate stencil.
There are two cases where v3d's sampler view's resource doesn't match the
base's: shadow textures for sampling from raster, and pointing at the
separate depth texture for z32f_s8x24. We only want to update shadow for
the first case.
Fixes
dEQP-GLES31.functional.stencil_texturing.render.depth32f_stencil8_draw
when run after the previous testcase.
Eric Anholt [Mon, 22 Apr 2019 17:40:47 +0000 (10:40 -0700)]
v3d: Add a note about i/o indirection for future performance work.
Eric Anholt [Fri, 19 Apr 2019 21:28:40 +0000 (14:28 -0700)]
vc4: Use _mesa_hash_table_remove_key() where appropriate.
Eric Anholt [Fri, 19 Apr 2019 21:26:42 +0000 (14:26 -0700)]
v3d: Use _mesa_hash_table_remove_key() where appropriate.
Eric Anholt [Thu, 25 Apr 2019 18:23:55 +0000 (11:23 -0700)]
v3d: Assert that we do request the normal texturing return data.
An unused tex should be DCEed, but if it wasn't we'd run into trouble with
not doing a TMUWT.
Eric Anholt [Wed, 24 Apr 2019 18:26:34 +0000 (11:26 -0700)]
v3d: Apply the GFXH-930 workaround to the case where the VS loads attrs.
We were emitting a dummy load for when the VS doesn't load any attributes,
but we also need to emit a dummy load for when the render VS loads
attributes but the binner VS doesn't. Fixes simulator assertion failures
and GPU hangs on KHR-GLES31.core.texture_gather.\*
Eric Anholt [Thu, 25 Apr 2019 00:25:08 +0000 (17:25 -0700)]
v3d: Fill in the ignored segment size fields to appease new simulator.
We are assured that the input segment size field is ignored for
!separate_segs mode, and now the simulator wants an in-range value set
regardless of whether it's functionally ignored or not.
Tapani Pälli [Tue, 23 Apr 2019 11:31:44 +0000 (14:31 +0300)]
glsl: use empty brace initializer
fixes following warning with clang:
warning: suggest braces around initialization of subobject
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
coypu [Sun, 7 Apr 2019 20:31:37 +0000 (23:31 +0300)]
gbm: don't return void
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Tapani Pälli [Tue, 23 Apr 2019 11:35:17 +0000 (14:35 +0300)]
nir: use braces around subobject in initializer
Used same syntax as elsewhere with Mesa sources, verified result
against MSVC with godbolt.org.
fixes following warning with clang:
warning: suggest braces around initialization of subobject
v2: empty braces -> braces around subobject (Caio, Kristian)
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Kristian H. Kristensen [Fri, 26 Apr 2019 18:37:36 +0000 (11:37 -0700)]
freedreno/drm: Quiet pointer to u64 conversion warning
Alok Hota [Mon, 22 Oct 2018 16:53:38 +0000 (11:53 -0500)]
swr/rast: enforce use of tile offsets
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Alok Hota [Tue, 19 Jun 2018 22:22:32 +0000 (17:22 -0500)]
swr/rast: AVX512 support compiled in by default
- Emulation of AVX512 built into SIMDLIB
- Remove associated macros
- Remove knobs controlling AVX512 and let emulation handle it
- Refactor variable names for SIMD16
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Alok Hota [Thu, 14 Jun 2018 17:30:56 +0000 (12:30 -0500)]
swr/rast: Remove deprecated 4x2 backend code
- Use 8x2 tiling by default
- Remove associated macros
- Use SIMDLIB emulation for SIMD16 on SIMD8 hardware
- Remove code rot in Load/StoreTile
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tomasz Figa [Thu, 25 Apr 2019 17:42:04 +0000 (18:42 +0100)]
llvmpipe: Always return some fence in flush (v2)
If there is no last fence, due to no rendering happening yet, just
create a new signaled fence and return it, to match the expectations of
the EGL sync fence API.
Fixes random "Could not create sync fence 0x3003" assertion failures from
Skia on Android, coming from the following code:
https://android.googlesource.com/platform/frameworks/base/+/master/libs/hwui/pipeline/skia/SkiaOpenGLPipeline.cpp#427
Reproducible especially with thread count >= 4.
One could make the driver always keep the reference to the last fence,
but:
- the driver seems to explicitly destroy the fence whenever a rendering
pass completes and changing that would require a significant functional
change to the code. (Specifically, in lp_scene_end_rasterization().)
- it still wouldn't solve the problem of an EGL sync fence being created
and waited on without any rendering happening at all, which is
also likely to happen with Android code pointed to in the commit.
Therefore, the simple approach of always creating a fence is taken,
similarly to other drivers, such as radeonsi.
Tested with piglit llvmpipe suite with no regressions and following
tests fixed:
egl_khr_fence_sync
conformance
eglclientwaitsynckhr_flag_sync_flush
eglclientwaitsynckhr_nonzero_timeout
eglclientwaitsynckhr_zero_timeout
eglcreatesynckhr_default_attributes
eglgetsyncattribkhr_invalid_attrib
eglgetsyncattribkhr_sync_status
v2:
- remove the useless lp_fence_reference() dance (Nicolai),
- explain why creating the dummy fence is the right approach.
Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Emil Velikov [Thu, 25 Apr 2019 17:42:03 +0000 (18:42 +0100)]
llvmpipe: correctly handle waiting in llvmpipe_fence_finish
Currently if the timeout differs from 0, we'll end up with infinite
wait... even if the user is perfectly clear they don't want that.
Use the new lp_fence_timedwait() helper guarding both waits in an
!lp_fence_signalled block like the rest of llvmpipe.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Emil Velikov [Thu, 25 Apr 2019 17:42:02 +0000 (18:42 +0100)]
llvmpipe: add lp_fence_timedwait() helper
The function is analogous to lp_fence_wait() while taking at timeout
(ns) parameter, as needed for EGL fence/sync.
v2:
- use absolute UTC time, as per spec (Gustaw)
- bail out on cnd_timedwait() failure (Gustaw)
v3:
- check count/rank under mutex (Gustaw)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com> (v1)
Reviewed-by: Gustaw Smolarczyk <wielkiegie@gmail.com>
Emil Velikov [Fri, 19 Apr 2019 11:11:00 +0000 (12:11 +0100)]
vulkan/wsi: don't use DUMB_CLOSE for normal GEM handles
Currently we get normal GEM handles from PrimeFDToHandle, yet we close
then with DUMB_CLOSE. Use GEM_CLOSE instead.
Fixes: da997ebec92 ("vulkan: Add KHR_display extension using DRM [v10]")
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: Keith Packard <keithp@keithp.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Emil Velikov [Wed, 17 Apr 2019 16:49:09 +0000 (17:49 +0100)]
vulkan/wsi: check if the display_fd given is master
As effectively required by the extension, we need to ensure we're master
Currently drivers employ vendor specific solutions, which check if the
device behind the fd is capable*, yet none of them do the master check.
*In the radv case, if acceleration is available.
Instead of duplicating the check in each driver, keep it where it's
needed and used.
Note this copies libdrm's drmIsMaster() to avoid depending on bleeding
edge version of the library.
v2: set the fd to -1 if not master (Bas)
Fixes: da997ebec92 ("vulkan: Add KHR_display extension using DRM [v10]")
Cc: Andres Rodriguez <andresx7@gmail.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: Keith Packard <keithp@keithp.com>
Reported-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Emil Velikov [Fri, 19 Apr 2019 10:47:34 +0000 (11:47 +0100)]
turnip: drop dead close(master_fd)
The fd is -1, thus the block of if (fd != -1) close(fd) is dead code.
Cc: Chad Versace <chadversary@chromium.org>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Jason Ekstrand [Mon, 22 Apr 2019 23:03:04 +0000 (18:03 -0500)]
nir/algebraic: Optimize integer cast-of-cast
These have been popping up more and more with the OpenCL work and other
bits causing extra conversions to/from 64-bit.
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Jason Ekstrand [Thu, 25 Apr 2019 19:15:26 +0000 (14:15 -0500)]
anv/descriptor_set: Don't fully destroy sets in pool destroy/reset
In
105002bd2d617, we fixed a memory leak bug where we weren't properly
destroying descriptor when destroying/resetting a descriptor pool.
However, the only real leak that happened was that we we take a
reference to the descriptor set layout in the descriptor set and we
weren't dropping our reference. Everything else in the descriptor set
is tied to the pool itself and doesn't need to be freed on a per-set
basis. This commit changes the destroy/reset functions to only bother
walking the list of sets to unref the layouts and otherwise we just
assume that the whole-pool destroy/reset takes care of the rest.
Now that we're doing more non-trivial things with descriptor sets such
as allocating things with util_vma_heap, per-set destruction is starting
to show up on perf traces. This takes reset back to where it's supposed
to be as a cheap whole-pool operation.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Fri, 26 Apr 2019 03:23:23 +0000 (22:23 -0500)]
anv: Better handle 32-byte alignment of descriptor set buffers
In
c520f4dec9c, we chose to align the sizes of descriptor set buffers to
32 bytes. We have to align the descriptor set buffer to 32B so that
it's valid for using with push constants. We align the size as well so
we don't leave lots of holes with util_vma_heap_alloc. Unfortunately,
we were only aligning it for alloc and not for free so we were still
creating piles of holes when we delete descriptor sets. This causes
terrible perf for the allocator once we've deleted piles of descriptor
sets.
This commit reworks the code so that we align the descriptor set buffer
size to 32B for both alloc and free. The result is that it takes the
new crucible vkResetDescriptorPool from 104.567719 to 2.898354 seconds.
Fixes: c520f4dec9c "anv: Add a concept of a descriptor buffer"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110497
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Dave Airlie [Fri, 26 Apr 2019 02:50:11 +0000 (12:50 +1000)]
nir: fix bit_size in lower indirect derefs.
This fixes a case where we are expecting 64-bit but generate
32-bit consts and validate gets angry.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Kenneth Graunke [Fri, 26 Apr 2019 00:33:36 +0000 (17:33 -0700)]
iris: Silence unused function warning
Marek Olšák [Mon, 8 Apr 2019 21:20:13 +0000 (17:20 -0400)]
glsl: fix shader_storage_blocks_write_access for SSBO block arrays (v2)
This fixes KHR-GL45.compute_shader.resources-max on radeonsi.
Fixes: 4e1e8f684bf "glsl: remember which SSBOs are not read-only and pass it to gallium"
v2: use is_interface_array, protect again assertion failures in u_bit_consecutive
Reviewed-by: Dave Airlie <airlied@redhat.com>
Rob Clark [Thu, 25 Apr 2019 22:48:19 +0000 (15:48 -0700)]
docs/features: update GL too
Forgot to update corresponding entries for desktop GL.. kinda wish we
didn't have to update both GLES and GL tables.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Rob Clark [Thu, 25 Apr 2019 19:28:35 +0000 (12:28 -0700)]
freedreno/a6xx: sample-shading support
Enables:
OES_sample_shading
OES_sample_variables
OES_shader_multisample_interpolation
Signed-off-by: Rob Clark <robdclark@chromium.org>
Rob Clark [Thu, 25 Apr 2019 19:25:02 +0000 (12:25 -0700)]
freedreno/ir3: sample-shading support
The compiler support for:
OES_sample_shading
OES_sample_variables
OES_shader_multisample_interpolation
Signed-off-by: Rob Clark <robdclark@chromium.org>
Rob Clark [Thu, 25 Apr 2019 19:22:30 +0000 (12:22 -0700)]
freedreno: wire up core sample-shading support
Signed-off-by: Rob Clark <robdclark@chromium.org>
Rob Clark [Thu, 25 Apr 2019 19:20:07 +0000 (12:20 -0700)]
freedreno/ir3: fix load_interpolated_input slot
The so->inputs[] table is in units of vec4
Fixes: 7ff6705b8d8 freedreno/ir3: convert to "new style" frag inputs
Signed-off-by: Rob Clark <robdclark@chromium.org>
Rob Clark [Thu, 25 Apr 2019 19:13:35 +0000 (12:13 -0700)]
freedreno/a6xx: add VALIDREG/CONDREG helper macros
There are a few places that we check if a shader stage input reg is
used/valid (ie. not r63.x).. and there are about to be a bunch more.
So add some helper macros for less open-coding.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Rob Clark [Thu, 25 Apr 2019 19:00:49 +0000 (12:00 -0700)]
freedreno/ir3: rename frag_vcoord -> ij_pixel
Since this is what the value actually is. Cleanup the name before
adding more different i,j related values for sample-shading.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Rob Clark [Thu, 25 Apr 2019 18:54:34 +0000 (11:54 -0700)]
freedreno/ir3: remove bogus assert
tex instruction can actually return 16b values.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Rob Clark [Fri, 19 Apr 2019 18:15:40 +0000 (11:15 -0700)]
freedreno/ir3: lower load_barycentric_at_offset
Calculates i,j at specified offset within a pixel. A new load_size_ir3
intrinsic is used in conjunction with fddx/fddy to translate the offset
into primitive space and adjust the i,j from load_barycentric_pixel
accordingly.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Rob Clark [Fri, 19 Apr 2019 18:12:34 +0000 (11:12 -0700)]
freedreno/ir3: lower load_barycentric_at_sample
This lowers load_barycentric_at_sample to load_sample_pos_from_id plus
load_barycentric_at_offset.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Rob Clark [Fri, 19 Apr 2019 18:10:49 +0000 (11:10 -0700)]
freedreno: update generated headers
Pull in updates for sample shading.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Rob Clark [Fri, 5 Apr 2019 22:21:05 +0000 (18:21 -0400)]
freedreno/ir3: cleanup instruction builder macros
De-duplicate the "normal" and "flags" versions of the macros, and while
at it go ahead and add "flags" versions for all the remaining macros,
since we'll at least need INSTR1F in a following commit.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Rob Clark [Wed, 17 Apr 2019 23:24:15 +0000 (16:24 -0700)]
freedreno/ir3: more emit-cat5 fixes
Couple more opcodes which don't take a sampler id as first arg.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Rob Clark [Fri, 5 Apr 2019 17:09:33 +0000 (13:09 -0400)]
freedreno/ir3: fix rgetpos decoding
It takes an argument.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Rob Clark [Wed, 3 Apr 2019 23:55:34 +0000 (19:55 -0400)]
compiler: rename SYSTEM_VALUE_VARYING_COORD
And add corresponding enums for different sorts of varying
interpolation.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Rob Clark [Tue, 16 Apr 2019 18:10:01 +0000 (11:10 -0700)]
freedreno: add robustness support
Signed-off-by: Rob Clark <robdclark@chromium.org>
Rob Clark [Tue, 16 Apr 2019 17:10:05 +0000 (10:10 -0700)]
freedreno/drm: update for robustness
Update UABI header and add FD_PP_PGTABLE and FD_NR_FAULTS params.
Robustness can be supported by a kernel which provides the new ABI if it
also indicates that per-process pagetables are in use.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Alyssa Rosenzweig [Thu, 25 Apr 2019 04:38:32 +0000 (04:38 +0000)]
panfrost/midgard: Add new bitwise ops
These fused NOT-ops could maybe help somehow...?
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Alyssa Rosenzweig [Thu, 25 Apr 2019 04:25:33 +0000 (04:25 +0000)]
panfrost/midgard: Identify inand
This was previously thought to be inot, but it's actually a bit more
general than that! :)
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>