Dave Airlie [Tue, 14 Nov 2017 23:59:42 +0000 (09:59 +1000)]
r600: add core pieces of image support.
This adds the atoms and gallium api implementations,
along with support for compress/decompress paths for
shader images.
Tested-By: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Tue, 14 Nov 2017 23:51:36 +0000 (09:51 +1000)]
r600/shader: implement getting thread id.
We need the thread id to use the immediate buffer readback
mechanism, so add support for calculating it.
Tested-By: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Tue, 14 Nov 2017 23:54:24 +0000 (09:54 +1000)]
r600/shader: add flag to denote if shader uses images
Tested-By: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Tue, 14 Nov 2017 23:48:29 +0000 (09:48 +1000)]
r600: implement basic memory barrier.
This isn't 100% perfect (fglrx also fails a bunch of those tests)
but implement the start of a memory barrier for image support.
Tested-By: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Tue, 14 Nov 2017 23:47:03 +0000 (09:47 +1000)]
r600: allocate immed buffer resource for images.
In order to do image readback we have to execute a MEM_RAT instruction
that needs a buffer to transfer the result into until the shader
can fetch it.
Tested-By: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Tue, 14 Nov 2017 23:46:01 +0000 (09:46 +1000)]
r600: handle writes_memory properly
This implements proper handling for shaders with side effects.
Tested-By: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dylan Baker [Tue, 7 Nov 2017 00:38:06 +0000 (16:38 -0800)]
autotools: change version TINY -> PATCH
Because patch is more common than tiny for talking about the 3rd element
of a version.
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Emil Velikov <emli.velikov@collabora.com>
Dylan Baker [Mon, 30 Oct 2017 23:52:29 +0000 (16:52 -0700)]
autotools: set XA versions in configure.ac and configure header file
Currently the versions are set in the header, and then sed is used to
extract them, so that autotools can use them elsewhere.
This is odd. Autotools is perfectly capable of configuring the header
with the versions, and then they don't need to be extracted from the
header. This is cleaner and more obvious.
Tested with make distcheck.
v2: - Split tiny -> patch change
- Drop temporary variables
- change XA_VERSION_* -> XA_*
v3: - Finish splitting the tiny -> patch change
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Emil Velikov <emli.velikov@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com> (v2)
Kenneth Graunke [Thu, 16 Nov 2017 07:06:27 +0000 (23:06 -0800)]
genxml: Fix PIPELINE_SELECT on G45/Ironlake.
Original 965 sets bits 28:27 to 0, while G45 and later set it to 1.
Note that the G45 docs are incorrect in this regard - see the DevCTG+
note in the Ironlake PRMs.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
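A minimal sketch of the difference described above, assuming a hypothetical helper name (this is not the genxml-generated code): bits 28:27 of the PIPELINE_SELECT header hold 0 on original 965 and 1 on G45 and later.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: contribution of bits 28:27 to the PIPELINE_SELECT
 * header dword. Original 965 uses 0, G45 and later use 1 (per the commit
 * message above). */
static inline uint32_t
pipeline_select_bits_28_27(bool is_g45_or_later)
{
   const uint32_t field = is_g45_or_later ? 1u : 0u;
   return field << 27;   /* two-bit field occupying bits 28:27 */
}
```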
Emil Velikov [Thu, 16 Nov 2017 18:33:22 +0000 (18:33 +0000)]
egl: pass the dri2_dpy to the $plat_teardown functions
Cc: Mark Janes <mark.a.janes@intel.com>
Fixes: 40a01c9a0ef ("egl/drm: move teardown code to the platform file")
Fixes: 8d745abc009 ("egl/wayland: move teardown code to the platform file")
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Dylan Baker <dylan@pnwbakers.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103784
Rafael Antognolli [Wed, 15 Nov 2017 17:32:47 +0000 (09:32 -0800)]
meson: Add dridriverdir variable to dri.pc.
Xorg (and possibly other things) depend on this variable to find the
path to DRI drivers.
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Dylan Baker [Tue, 17 Oct 2017 19:19:49 +0000 (12:19 -0700)]
docs: add documentation for building with meson
v2: - Add information about CC, CXX, CFLAGS, and CXXFLAGS (Nicolai)
- Add message at top that meson for mesa is still a work in progress
- Add trailing "/" to directories (Eric E.)
- Fix a number of spelling/grammar/style suggestions from Eric E.
- Make a number of changes as suggested by Emil.
v3: - Fix order of commands in example (Eric E.)
- Add documentation for overriding LLVM version (Eric E.)
v4: - Rebase on master
- update default buildtype
- add note about b_ndebug
- Clarify meson configure a bit
v5: - use <code> for command line arguments (Eric E.)
- Add note about listing options without a build directory
- Minor formatting changes (Eric E.)
- Replace the CC, CFLAGS, etc section with an environment variables
section, which mentions CC, CXX, CFLAGS, CXXFLAGS, LDFLAGS, and
DESTDIR
- Add comment that not using buildtype debug might make debugging
harder
- Add comment that b_ndebug and buildtype are orthogonal
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch> (v3)
Kai Wasserbäch [Thu, 16 Nov 2017 11:58:50 +0000 (12:58 +0100)]
docs: Point to apt.llvm.org for development snapshot packages
Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Eric Engestrom [Thu, 16 Nov 2017 10:02:15 +0000 (10:02 +0000)]
egl: fix var type
queryImage() takes an `int*`; compiler is warning about the
signed<->unsigned pointer mismatch.
Fixes: 0db36caa192b129cb4f2 "egl/wayland: Add a fallback when fourcc
query isn't supported"
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Derek Foreman <derekf@osg.samsung.com>
Emil Velikov [Thu, 16 Nov 2017 15:51:49 +0000 (15:51 +0000)]
i915: add missing extensions.h include
Otherwise we'll bail due to -Werror=implicit-function-declaration.
It went unnoticed since we had a bug which did not consistently set the
compiler flag.
Fixes: ba8a347f932 ("mesa: split extensions overrides and glGetString(GL_EXTENSIONS)")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Emil Velikov [Mon, 6 Nov 2017 18:01:36 +0000 (18:01 +0000)]
mesa: return 'unrecognized' extensions in glGetStringi
Analogous to the glGetString() case - report all the
extensions enabled via MESA_EXTENSION_OVERRIDE
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Emil Velikov [Mon, 6 Nov 2017 16:14:51 +0000 (16:14 +0000)]
mesa: rework the way we manage extra_extensions
Store pointers to the tokenized strings in the gl_extensions struct.
This way we can reuse them in glGetStringi() while we construct the
really long string only in _mesa_make_extension_string.
Only 16 pointers/strings are stored for now.
v2: Warn only once when we provide more than 16 unk. extensions, rebase
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
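A rough sketch of the idea above, assuming hypothetical names (struct extra_ext_list and store_extra_extensions are stand-ins, not the actual gl_extensions layout): tokenize the override string in place, keep pointers to at most 16 tokens, and warn only once when more are supplied.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_EXTRA_EXTENSIONS 16   /* limit named in the commit message */

/* Hypothetical stand-in for the relevant part of gl_extensions. */
struct extra_ext_list {
   const char *names[MAX_EXTRA_EXTENSIONS];
   unsigned count;
};

/* Tokenizes 'buf' in place on spaces and stores pointers to the tokens.
 * Tokens beyond the limit are dropped, with a single warning. */
static void
store_extra_extensions(struct extra_ext_list *list, char *buf)
{
   static int warned = 0;

   list->count = 0;
   for (char *tok = strtok(buf, " "); tok; tok = strtok(NULL, " ")) {
      if (list->count == MAX_EXTRA_EXTENSIONS) {
         if (!warned) {
            fprintf(stderr, "too many unrecognized extensions, keeping %d\n",
                    MAX_EXTRA_EXTENSIONS);
            warned = 1;
         }
         break;
      }
      list->names[list->count++] = tok;
   }
}
```

Because only pointers into the tokenized buffer are kept, the buffer has to outlive the list.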
Emil Velikov [Mon, 6 Nov 2017 17:58:08 +0000 (17:58 +0000)]
mesa: pass the ctx to _mesa_one_time_init_extension_overrides
Will be needed with next commit
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Emil Velikov [Mon, 6 Nov 2017 16:02:32 +0000 (16:02 +0000)]
mesa: call atexit() only as needed
If the extra_extensions string is empty there's no need to call
atexit() - there's nothing to free.
v2: Rebase
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
Emil Velikov [Wed, 25 Oct 2017 10:33:09 +0000 (11:33 +0100)]
mesa: remove unnecessary 'sort by year' for the GL extensions
The sorting was originally added to work around broken games (comment
says Quake3 demo) that were copying the extensions list into small
buffer.
Sorting does not solve the problem, since we'll still overflow and cause
corruption/crash.
A better workaround is to actually trim the string ... as done with a
later commit which introduces the MESA_EXTENSION_MAX_YEAR env. variable.
Side note: On my machine, the existing sorting makes no changes to the
extensions string.
Cc: Jose Fonseca <jfonseca@vmware.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Emil Velikov [Tue, 24 Oct 2017 14:57:08 +0000 (15:57 +0100)]
mesa: reuse set_extension() for _mesa_extension_override_disables
We already use it for _mesa_extension_override_enables.
Improve consistency and use it for both extension lists.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Emil Velikov [Tue, 24 Oct 2017 13:35:41 +0000 (14:35 +0100)]
mesa: drop unnecessary copying of extra_extensions
The function get_extension_override() returns a copy of a string,
only for it to be copied again ...
Drop the unneeded calloc/strdup/free dance.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Emil Velikov [Tue, 24 Oct 2017 14:47:41 +0000 (15:47 +0100)]
mesa: remove duplicate 'disabled extensions' list
While parsing MESA_EXTENSION_OVERRIDE we keep track of the disabled
extensions, twice - in _mesa_extension_override_disables and
disabled_extensions.
Upon context creation, we use the former to modify the extensions list.
Yet, we still check the updated list against disabled_extensions.
Remove disabled_extensions, it's obsolete.
Cc: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Emil Velikov [Mon, 6 Nov 2017 15:33:52 +0000 (15:33 +0000)]
mesa: call _mesa_make_extension_string only as needed
As of previous commit we removed the extension overrides from this
function.
Thus we no longer need to call it during MakeCurrent, so we can
construct the extensions string when needed - _mesa_GetString.
This commit effectively reverts a879d14ecf8 ("mesa: initialize extension
string when context is first bound")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Emil Velikov [Mon, 6 Nov 2017 15:20:35 +0000 (15:20 +0000)]
mesa: split extensions overrides and glGetString(GL_EXTENSIONS)
Currently we apply the extension overrides and construct the extensions
string upon MakeCurrent.
They are two distinct things, so let's split the two while pushing the
overrides management _before_ _mesa_compute_version(). This ensures that
the version is updated to reflect the enabled/disabled extensions.
Cc: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Emil Velikov [Tue, 24 Oct 2017 14:21:40 +0000 (15:21 +0100)]
i965: remove ARB_compute_shader extension override
Checking the override was useful in the early stages of developing the
extension.
Now that everything is wired, where possible, we can drop the check.
Doing so allows us to simplify some of the related code.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Emil Velikov [Tue, 24 Oct 2017 10:58:56 +0000 (11:58 +0100)]
i965: use _mesa_is_desktop_gl helper
Use the helper over opencoding the check.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Emil Velikov [Mon, 13 Nov 2017 14:02:56 +0000 (14:02 +0000)]
egl: add note about missing $plat_teardown
Some platforms are missing a proper teardown function. Add a small TODO
to make it obvious.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Emil Velikov [Thu, 9 Nov 2017 19:13:09 +0000 (19:13 +0000)]
egl/wayland: move teardown code to the platform file
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Emil Velikov [Thu, 9 Nov 2017 19:04:25 +0000 (19:04 +0000)]
egl/drm: move teardown code to the platform file
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Emil Velikov [Thu, 9 Nov 2017 18:58:52 +0000 (18:58 +0000)]
egl/x11: move teardown code to the platform file
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Emil Velikov [Thu, 9 Nov 2017 17:55:19 +0000 (17:55 +0000)]
egl: Provide meaningful error when built w/o requested platform
The current "No EGL platform enabled." is misleading and wrong.
We reach said code when $platform is missing.
To make this more obvious and clear provide wrappers in the header
file, making the code a bit easier to follow.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Jon Turney [Mon, 13 Nov 2017 10:13:39 +0000 (10:13 +0000)]
meson: Don't define HAVE_PTHREAD only on linux
I'm not sure of the reason for this. I don't see anything like this in
configure.ac
In include/c11/threads.h the cases are:
1) building for Windows -> threads_win32.h
2) HAVE_PTHREAD -> threads_posix.h
3) Not supported on this platform
So not defining HAVE_PTHREAD for anything not Windows just means we can't
build at all.
When we are building for Windows, I'm not sure if dependency('threads')
would ever find anything, or defining HAVE_PTHREAD has any effect, but avoid
defining it there, just in case.
Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
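The three cases listed above can be sketched as a small runtime function. This is only an illustration: the real selection in include/c11/threads.h happens in the preprocessor, and select_threads_backend is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Mirrors the order of the cases in the commit message:
 *   1) building for Windows -> threads_win32.h
 *   2) HAVE_PTHREAD         -> threads_posix.h
 *   3) otherwise            -> not supported (NULL here) */
static const char *
select_threads_backend(bool building_for_windows, bool have_pthread)
{
   if (building_for_windows)
      return "threads_win32.h";
   if (have_pthread)
      return "threads_posix.h";
   return NULL;
}
```

With HAVE_PTHREAD left undefined on a non-Windows platform, only the third case remains, which is why the build cannot succeed there.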
Rob Clark [Thu, 16 Nov 2017 13:37:59 +0000 (08:37 -0500)]
freedreno: also mark images used by draw/grid
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Thu, 16 Nov 2017 13:32:32 +0000 (08:32 -0500)]
freedreno: mark SSBOs written at draw time
Comment was right, implementation was wrong ;-)
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Wed, 15 Nov 2017 14:56:38 +0000 (09:56 -0500)]
freedreno/a5xx: ARB_framebuffer_no_attachments support
Signed-off-by: Rob Clark <robdclark@gmail.com>
Kenneth Graunke [Tue, 14 Nov 2017 23:24:36 +0000 (15:24 -0800)]
i965: Implement another VF cache invalidate workaround on Gen8+.
...and provide a better citation for the existing one.
v2:
- Apply the workaround to Gen8 too, as intended (caught by Topi).
- Restructure to add bits instead of an extra flush (based on a similar
patch by Rafael Antognolli).
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Nicolai Hähnle [Wed, 15 Nov 2017 18:34:00 +0000 (19:34 +0100)]
tgsi/exec: fix LDEXP in softpipe
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103128
Fixes: cad959d90145 ("gallium: add LDEXP TGSI instruction and corresponding cap")
Reviewed-by: Brian Paul <brianp@vmware.com>
Nicolai Hähnle [Wed, 15 Nov 2017 11:41:58 +0000 (12:41 +0100)]
threads,configure.ac,meson.build: define and use HAVE_TIMESPEC_GET
Tested with Travis and Appveyor.
v2: add HAVE_TIMESPEC_GET for non-Windows Scons builds
v3: use check_functions in Scons (Eric)
Cc: Rob Herring <robh@kernel.org>
Cc: Alexander von Gluck IV <kallisti5@unixzen.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103674
Fixes: f1a364878431 ("threads: update for late C11 changes")
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk> (v2)
Timothy Arceri [Wed, 8 Nov 2017 04:43:16 +0000 (15:43 +1100)]
radeonsi: copy some nir gs info
v2: copy input primitive
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Timothy Arceri [Thu, 9 Nov 2017 04:23:23 +0000 (15:23 +1100)]
ac: add gs_{prim,invocation}_id to the abi
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Timothy Arceri [Mon, 6 Nov 2017 11:28:21 +0000 (22:28 +1100)]
radeonsi: gather stream info in nir path
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Vinson Lee [Wed, 15 Nov 2017 01:16:32 +0000 (17:16 -0800)]
mapi: Use correct shared libraries suffix on macOS.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Brian Paul [Wed, 15 Nov 2017 05:17:49 +0000 (22:17 -0700)]
tgsi: whitespace clean-ups in tgsi_util.[ch]
Trivial.
Brian Paul [Tue, 14 Nov 2017 17:51:10 +0000 (10:51 -0700)]
svga: s/unsigned/enum tgsi_texture_type/
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Brian Paul [Tue, 14 Nov 2017 17:50:59 +0000 (10:50 -0700)]
tgsi: s/unsigned/enum tgsi_texture_type/
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Frank Richter [Tue, 17 Oct 2017 08:34:27 +0000 (10:34 +0200)]
gallium/wgl: fix default pixel format issue
When creating a context without SetPixelFormat() don't blindly take the
pixel format reported by GDI. Instead, look for our own closest pixel
format.
Minor clean-ups added by Brian Paul.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103412
Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Brian Paul <brianp@vmware.com>
Brian Paul [Fri, 10 Nov 2017 20:13:46 +0000 (13:13 -0700)]
svga: issue debug warning for unsupported two-sided stencil state
We only have a single stencil read mask and write mask. Issue a
warning if different front/back values are used. The Piglit
gl-2.0-two-sided-stencil test hits this.
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Brian Paul [Fri, 10 Nov 2017 18:05:01 +0000 (11:05 -0700)]
st/mesa: whitespace fixes in st_manager.c
Trivial.
Brian Paul [Fri, 10 Nov 2017 17:58:28 +0000 (10:58 -0700)]
st/mesa: whitespace clean-ups in st_context.c
Trivial.
Brian Paul [Fri, 10 Nov 2017 18:00:22 +0000 (11:00 -0700)]
st/mesa: move st_manager_destroy() earlier in file
To avoid forward declaration.
Reviewed-By: Gert Wollny <gw.fossdev@gmail.com>
Brian Paul [Fri, 10 Nov 2017 17:54:11 +0000 (10:54 -0700)]
st/mesa: move st_init_driver_flags() earlier in file
To get rid of forward declaration.
Reviewed-By: Gert Wollny <gw.fossdev@gmail.com>
Brian Paul [Fri, 10 Nov 2017 17:24:36 +0000 (10:24 -0700)]
docs: update llvmpipe.html build instructions
Wladimir J. van der Laan [Tue, 14 Nov 2017 09:21:23 +0000 (10:21 +0100)]
etnaviv: Add sampler TS support
Sampler TS is a hardware optimization that can be used when rendering
to textures. After rendering to a resource with TS enabled, the
texture unit can use this to bypass lookups to empty tiles. This also
means a resolve-in-place can be avoided to flush the TS.
This commit is also an optimization when not using sampler TS, as
resolve-in-place will now be skipped if a resource has no (valid) TS.
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Wladimir J. van der Laan [Tue, 14 Nov 2017 09:21:22 +0000 (10:21 +0100)]
etnaviv: Flush TS cache before changing TS configuration
This is to make sure that the TS is properly flushed to memory before
rendering to a new surface starts.
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Wladimir J. van der Laan [Tue, 14 Nov 2017 09:21:21 +0000 (10:21 +0100)]
etnaviv: Add TS_SAMPLER formats to etnaviv_format
Sampler TS introduces yet another format enumeration for
renderable+textureable formats. Introduce it into the etnaviv_format
table as another column.
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Wladimir J. van der Laan [Tue, 14 Nov 2017 09:21:20 +0000 (10:21 +0100)]
etnaviv: Check that resource has a valid TS in etna_resource_needs_flush
Resources only need a resolve-to-itself if their TS is valid for any
level, not just if it happens to be allocated.
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Wladimir J. van der Laan [Tue, 14 Nov 2017 09:21:19 +0000 (10:21 +0100)]
etnaviv: rnndb update
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Dave Airlie [Tue, 14 Nov 2017 03:23:00 +0000 (13:23 +1000)]
radv: it isn't an error to not support a format or driver
This reverts two of the vk_error changes:
reporting unsupported format is common,
and testing non-amdgpu drivers and ignoring them is also common.
Fixes: cd64a4f70 (radv: use vk_error() everywhere an error is returned)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Kenneth Graunke [Tue, 14 Nov 2017 07:52:33 +0000 (23:52 -0800)]
i965: Drop some reserved space remnants.
BATCH_RESERVED was deleted in commit 2c46a67b4138631217 (i965: Delete
BATCH_RESERVED handling.) The reserved_space field is dead code, and
the comments aren't useful these days.
Kenneth Graunke [Tue, 14 Nov 2017 07:48:37 +0000 (23:48 -0800)]
intel: Drop mtypes.h include from brw_compiler.h.
This isn't necessary and causes trouble for a project I'm working on.
Kenneth Graunke [Fri, 3 Nov 2017 21:52:05 +0000 (14:52 -0700)]
i965: Fold ABO state upload code into the SSBO/UBO state upload code.
Having this separate could potentially make programs that rebind atomics
but no other surfaces ever so slightly faster. But it's a tiny amount
of code to add to the existing UBO/SSBO atom, and very related.
The extra atoms have a cost on every draw call, and so dropping some of
them would be nice. This also reclaims a dirty bit.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Kenneth Graunke [Fri, 3 Nov 2017 21:52:05 +0000 (14:52 -0700)]
i965: Use nir_lower_atomics_to_ssbos and delete ABO compiler code.
We use the same hardware mechanism for both atomic counters and SSBO
atomics, so there's really no benefit to maintaining separate code to
handle each case. Instead, we can just use Rob's shiny new NIR pass to
convert atomic_uints to SSBOs, and delete piles of code.
The ssbo_start section of the binding table becomes a combined ABO and
SSBO section, with ABOs first, then SSBOs.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Kenneth Graunke [Tue, 7 Nov 2017 00:05:43 +0000 (16:05 -0800)]
i965: Make a better helper function for UBO/SSBO/ABO surface handling.
This fixes the missing AutomaticSize handling in the ABO code, removes
a bunch of duplicated code, and drops an extra layer of wrapping around
brw_emit_buffer_surface_state().
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Samuel Pitoiset [Tue, 14 Nov 2017 16:27:29 +0000 (17:27 +0100)]
radv: add the vertex buffers BO to the list at bind time
This should reduce the overhead of adding a BO to the current
list, especially when the list is huge. Also, when a new pipeline
is bound, we only need to update the descriptor, the buffer objects
should already be in the list.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Tue, 14 Nov 2017 16:27:28 +0000 (17:27 +0100)]
radv: replace vb_dirty with RADV_CMD_DIRTY_VERTEX_BUFFER
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Tue, 14 Nov 2017 16:27:27 +0000 (17:27 +0100)]
radv: drop radv_cmd_dirty_mask_t typedef
I don't think we will need a 64-bit unsigned integer for the
dirty flags in the future, and there are still 20 bits left.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Tue, 14 Nov 2017 16:29:18 +0000 (17:29 +0100)]
radv: use an unsigned 32-bit integer for radv_queue::family_index
VkDeviceQueueCreateInfo::queueFamilyIndex is an unsigned 32-bit
integer.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Tue, 14 Nov 2017 15:38:20 +0000 (16:38 +0100)]
radv: do not add the image BO in radv_set_dcc_need_cmask_elim_pred()
radv_fill_buffer() ensures that the image BO is added to the list.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Tue, 14 Nov 2017 15:38:19 +0000 (16:38 +0100)]
radv: do not add the image BO in radv_set_color_clear_regs()
radv_fill_buffer() ensures that the image BO is added to the list.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Roland Scheidegger [Thu, 9 Nov 2017 18:53:49 +0000 (19:53 +0100)]
r600: set the number type correctly for float rts in cb setup
Float rts were always set as unorm instead of float.
Not sure of the consequences, but at least it looks like the blend clamp
would have been enabled, which is against the rules (only eg really bothered
to even attempt to specify this correctly, r600 always used clamp anyway).
Albeit r600 (not r700) setup still looks bugged to me due to never setting
BLEND_FLOAT32 which must be set according to docs...
Not sure if the hw really cares, no piglit change (on eg/juniper).
Reviewed-by: Dave Airlie <airlied@redhat.com>
Roland Scheidegger [Thu, 9 Nov 2017 18:50:41 +0000 (19:50 +0100)]
r600: use ieee version of rsq
Both r600 and evergreen used the clamped version, whereas cayman used the
ieee one. I don't think there's a valid reason for this discrepancy, so let's
switch to the ieee version for r600 and evergreen too, since we generally
want to stick to ieee arithmetic.
With this, behavior for both rcp and rsq should now be the same for all of
r600, eg, cm, all using ieee versions (albeit note rsq retains the abs
behavior for everybody, which may not be a good idea ultimately).
Reviewed-by: Dave Airlie <airlied@redhat.com>
Roland Scheidegger [Thu, 9 Nov 2017 18:44:23 +0000 (19:44 +0100)]
r600: use ieee version of rcp
r600 used the clamped version for rcp, whereas both evergreen and cayman
used the ieee version. I don't know why that discrepancy exists (it does so
since day 1) but there does not seem to be a valid reason for this, so make
it consistent. This seems now safer than before the previous commit (using
the dx10 clamp bit).
Note that rsq still uses clamped version (as before even though the table
may have suggested otherwise for evergreen) for r600/eg, but not for cayman.
Will be changed separately for better regression tracking...
Reviewed-by: Dave Airlie <airlied@redhat.com>
Roland Scheidegger [Thu, 9 Nov 2017 18:41:29 +0000 (19:41 +0100)]
r600: use DX10_CLAMP bit in shader setup
The docs are not very concise in what this really does, however both
Alex Deucher and Nicolai Hähnle suggested this only really affects instructions
using the CLAMP output modifier, and I've confirmed that with the newly
changed piglit isinf_and_isnan test.
So, with this bit set, if an instruction has the CLAMP modifier bit (which
clamps to [0,1]) set, then NaNs will be converted to zero, otherwise the result
will be NaN.
D3D10 would require this, glsl doesn't have modifiers (with mesa
clamp(x,0,1) would get converted to such a modifier) coupled with a
whatever-floats-your-boat specified NaN behavior, but the clamp behavior
should probably always be used (this also matches what a decomposition into
min(1.0, max(x, 0.0)) would do, if min/max also adhere to the ieee spec of
picking the non-nan result).
Some apps may in fact rely on this, as this prevents misrenderings in
This War of Mine since using ieee muls
(ce7a045feeef8cad155f1c9aa07f166e146e3d00), without having to use clamped
rcp opcode, which would also fix this bug there.
radeonsi also seems to set this bit nowadays if I see that right (albeit the
llvm amdgpu code comment now says "Make clamp modifier on NaN input returns 0"
instead of "Do not clamp NAN to 0" since it was changed, which also looks
a bit misleading).
v2: set it in all shader stages.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103544
Reviewed-by: Dave Airlie <airlied@redhat.com>
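A minimal software model of the CLAMP modifier behavior described above, assuming hypothetical function names (this is not driver code): with the DX10_CLAMP bit set, a NaN input clamps to zero; without it, the NaN is passed through.

```c
#include <assert.h>
#include <math.h>

/* CLAMP to [0,1] with DX10_CLAMP set: NaN is converted to 0. */
static float
clamp01_dx10(float x)
{
   if (x != x)          /* true only for NaN */
      return 0.0f;
   if (x < 0.0f)
      return 0.0f;
   if (x > 1.0f)
      return 1.0f;
   return x;
}

/* CLAMP to [0,1] without the bit: a NaN input stays NaN. */
static float
clamp01_nan_preserving(float x)
{
   if (x != x)
      return x;
   return clamp01_dx10(x);
}
```

The first variant gives the same result as min(1.0, max(x, 0.0)) when min and max pick the non-NaN operand.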
Roland Scheidegger [Thu, 9 Nov 2017 18:37:54 +0000 (19:37 +0100)]
r600: use min_dx10/max_dx10 instead of min/max
I believe this is the safe thing to do, especially ever since the driver
actually generates NaNs for muls too.
The ISA docs are not very helpful here, however the dx10 versions will pick
a non-nan result over a NaN one (this is also the ieee754 behavior), whereas
the non-dx10 ones will pick the NaN (verified by newly changed piglit
isinf-and-isnan test).
Other "modern" drivers will most likely do the same.
This was shown to make some difference for bug 103544, albeit it is not
required to fix it.
Reviewed-by: Dave Airlie <airlied@redhat.com>
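The difference between the two min flavours can be modelled in plain C. This is a sketch under the behavior stated above, with hypothetical names (min_dx10_model, min_non_dx10_model), not the hardware opcodes themselves.

```c
#include <assert.h>
#include <math.h>

/* dx10 flavour: prefers the non-NaN operand (ieee754 minNum behavior). */
static float
min_dx10_model(float a, float b)
{
   if (a != a)
      return b;
   if (b != b)
      return a;
   return a < b ? a : b;
}

/* non-dx10 flavour: if either operand is NaN, the NaN is returned. */
static float
min_non_dx10_model(float a, float b)
{
   if (a != a)
      return a;
   if (b != b)
      return b;
   return a < b ? a : b;
}
```

max follows the same pattern with the comparison reversed.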
Dave Airlie [Tue, 14 Nov 2017 22:26:23 +0000 (08:26 +1000)]
r600: fix cubemap arrays
A lot of cubemap array piglits fail, port the texture type
picking code from radeonsi which seems to fix most of them.
For images I will port the rest of the code.
Fixes:
getteximage-depth gl_texture_cube_map_array-*
fbo-generatemipmap-cubemap array
getteximage-targets cube_array
amongst others.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Rob Clark [Tue, 14 Nov 2017 23:09:38 +0000 (18:09 -0500)]
freedreno/a5xx: small comment fix
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Tue, 14 Nov 2017 19:40:40 +0000 (14:40 -0500)]
freedreno/a5xx: indirect draw support
A couple failures in piglit tests w/ TF or gl_VertexID + indirect draws.
OTOH all the deqp tests pass (although they don't test those combinations).
I suspect this could be fixed by a firmware update, but I don't think
there is much we can do in mesa for that.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Tue, 14 Nov 2017 19:15:27 +0000 (14:15 -0500)]
freedreno/a5xx: split out helper for pipeline stalls
We need a similar thing for indirect draws.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Tue, 14 Nov 2017 19:05:56 +0000 (14:05 -0500)]
freedreno: update generated headers
Signed-off-by: Rob Clark <robdclark@gmail.com>
Timothy Arceri [Mon, 13 Nov 2017 00:34:31 +0000 (11:34 +1100)]
gallium/radeon: disable the cache when nir backend enabled
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Timothy Arceri [Mon, 13 Nov 2017 01:55:34 +0000 (12:55 +1100)]
st/glsl_to_tgsi: use tgsi_get_gl_varying_semantic() for gs/tes outputs
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Timothy Arceri [Mon, 13 Nov 2017 01:43:36 +0000 (12:43 +1100)]
gallium/tgsi: add tess output support to tgsi_get_gl_varying_semantic()
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Timothy Arceri [Mon, 13 Nov 2017 00:58:59 +0000 (11:58 +1100)]
st/glsl_to_tgsi: make use of tgsi_get_gl_varying_semantic()
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Timothy Arceri [Mon, 13 Nov 2017 01:12:21 +0000 (12:12 +1100)]
gallium/tgsi: add prim id to tgsi_get_gl_varying_semantic()
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Anuj Phogat [Mon, 13 Nov 2017 19:23:51 +0000 (11:23 -0800)]
i965: Make use of brw_load_register_imm32() helper function
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: Nanley Chery <nanley.g.chery@intel.com>
Anuj Phogat [Thu, 9 Nov 2017 19:30:10 +0000 (11:30 -0800)]
i965/gen8+: Fix the number of dwords programmed in MI_FLUSH_DW
Number of dwords in MI_FLUSH_DW changed from 4 to 5 in gen8+.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
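A small sketch of the size change above, using hypothetical helper names. The "total minus 2" bias for the DWord Length field is an assumption based on the usual convention for Intel command headers; it is not stated in this commit.

```c
#include <assert.h>

/* Total number of dwords in MI_FLUSH_DW: 4 before gen8, 5 on gen8+
 * (per the commit message above). */
static unsigned
mi_flush_dw_num_dwords(int gen)
{
   return gen >= 8 ? 5u : 4u;
}

/* ASSUMPTION: the DWord Length header field is encoded as the total
 * dword count minus 2. */
static unsigned
mi_flush_dw_length_field(int gen)
{
   return mi_flush_dw_num_dwords(gen) - 2u;
}
```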
Anuj Phogat [Fri, 10 Nov 2017 22:39:17 +0000 (14:39 -0800)]
i965: Program DWord Length in MI_FLUSH_DW
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
Anuj Phogat [Fri, 10 Nov 2017 22:22:44 +0000 (14:22 -0800)]
anv/gen10: Enable float blend optimization
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Anuj Phogat [Fri, 10 Nov 2017 22:22:18 +0000 (14:22 -0800)]
intel/genxml: Add Cache Mode SubSlice Register to gen10.xml
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Anuj Phogat [Tue, 7 Nov 2017 19:13:15 +0000 (11:13 -0800)]
anv/gen10: Implement WaSampleOffsetIZ workaround
We already have this workaround in the OpenGL driver.
See Mesa commit 3cf4fe2219.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: Nanley Chery <nanley.g.chery@intel.com>
Cc: Rafael Antognolli <rafael.antognolli@intel.com>
Andres Rodriguez [Sat, 11 Nov 2017 00:07:24 +0000 (19:07 -0500)]
mesa/st: add missing copyright headers to memoryobjects files
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Andres Rodriguez [Sat, 11 Nov 2017 00:07:23 +0000 (19:07 -0500)]
mesa: minor tidy up for memory object error strings
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Andres Rodriguez [Sat, 11 Nov 2017 00:07:22 +0000 (19:07 -0500)]
broadcom/vc4: fix indentation in vc4_screen.c
Stumbled into this when adding a new PIPE_CAP.
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Matt Turner [Tue, 14 Nov 2017 19:24:08 +0000 (11:24 -0800)]
Revert "intel/fs: Use a pure vertical stride for large register strides"
This reverts commit e8c9e65185de3e821e1e482e77906d1d51efa3ec.
With the actual bug fixed (by commit 6ac2d1690192), this is not
necessary. I'm doubtful of its correctness in any case.
Matt Turner [Fri, 10 Nov 2017 22:00:24 +0000 (14:00 -0800)]
i965/fs: Fix extract_i8/u8 to a 64-bit destination
The MOV instruction can extract bytes to words/double words, and
words/double words to quadwords, but not bytes to quadwords.
For unsigned byte to quadword, we can read them as words and AND off the
high byte and extract to quadword in one instruction. For signed bytes,
we need to first sign extend to word and then sign extend that word to a
quadword.
Fixes the following test on CHV, BXT, and GLK:
KHR-GL46.shader_ballot_tests.ShaderBallotBitmasks
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103628
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
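The two widening strategies described above can be modelled on the CPU in plain C. This is only a sketch of the arithmetic, assuming a 32-bit source and a byte index of 0..3; the function names are hypothetical and it does not reproduce the instructions the i965 backend emits.

```c
#include <stdint.h>

/* Unsigned byte -> quadword: go through a 16-bit word and mask off the
 * high byte, so the widening step is word -> quadword, never byte ->
 * quadword. */
static uint64_t extract_u8_to_u64(uint32_t src, unsigned byte_idx)
{
    uint16_t word = (uint16_t)(src >> (8u * byte_idx));
    return (uint64_t)(uint16_t)(word & 0x00ffu);
}

/* Signed byte -> quadword: first sign extend the byte to a word, then
 * sign extend that word to a quadword (two separate steps). */
static int64_t extract_i8_to_i64(uint32_t src, unsigned byte_idx)
{
    uint8_t raw = (uint8_t)(src >> (8u * byte_idx));
    /* Step 1: byte -> word, written out explicitly to avoid relying on
     * implementation-defined narrowing conversions. */
    int16_t word = (int16_t)(raw >= 0x80u ? (int)raw - 0x100 : (int)raw);
    /* Step 2: word -> quadword. */
    return (int64_t)word;
}
```

For example, with src = 0x80FF7F01, byte 3 yields 0x80 (128) from the unsigned path and -128 from the signed path.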
Matt Turner [Wed, 8 Nov 2017 23:14:19 +0000 (15:14 -0800)]
i965/fs: Split all 32->64-bit MOVs on CHV, BXT, GLK
Fixes the following tests on CHV, BXT, and GLK:
KHR-GL46.shader_ballot_tests.ShaderBallotFunctionBallot
dEQP-VK.spirv_assembly.instruction.compute.uconvert.uint32_to_int64
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103115
Tim Rowley [Tue, 14 Nov 2017 00:39:38 +0000 (18:39 -0600)]
swr/rast: Faster emulated simd16 permute
Speed up simd16 frontend (default) on avx/avx2 platforms;
fixes performance regression caused by switch to simdlib.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Tim Rowley [Mon, 13 Nov 2017 21:11:21 +0000 (15:11 -0600)]
swr/rast: Use gather instruction for i32gather_ps on simd16/avx512
Speed up avx512 platforms; fixes performance regression caused
by switch to simdlib.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Derek Foreman [Mon, 30 Oct 2017 20:52:22 +0000 (15:52 -0500)]
egl/wayland: Add a fallback when fourcc query isn't supported
When queryImage doesn't support __DRI_IMAGE_ATTRIB_FOURCC wayland clients
will die with a NULL dereference in wl_proxy_add_listener.
Attempt to provide a simple fallback to keep ancient systems working.
Fixes: 6595c699511 ("egl/wayland: Remove more surface specifics from create_wl_buffer")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103519
Signed-off-by: Derek Foreman <derekf@osg.samsung.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
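One plausible shape for such a fallback, sketched in C: if the fourcc query fails, derive the fourcc from the image format instead. Everything here is an assumption for illustration. The enum, the function names and the query_ok flag are hypothetical stand-ins, not the DRI interface or the code in this commit; only the fourcc character codes (AR24, XR24, RG16) follow the DRM fourcc convention.

```c
#include <stdbool.h>
#include <stdint.h>

/* Pack four characters into a little-endian fourcc code. */
#define SKETCH_FOURCC(a, b, c, d) \
    ((uint32_t)(a) | ((uint32_t)(b) << 8) | \
     ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

/* Hypothetical image formats standing in for the driver's format enum. */
enum sketch_format {
    SKETCH_FORMAT_RGB565,
    SKETCH_FORMAT_XRGB8888,
    SKETCH_FORMAT_ARGB8888,
    SKETCH_FORMAT_UNKNOWN
};

/* Fallback table: map a format to its fourcc; false if there is none. */
static bool sketch_format_to_fourcc(enum sketch_format fmt, uint32_t *fourcc)
{
    switch (fmt) {
    case SKETCH_FORMAT_RGB565:
        *fourcc = SKETCH_FOURCC('R', 'G', '1', '6');
        return true;
    case SKETCH_FORMAT_XRGB8888:
        *fourcc = SKETCH_FOURCC('X', 'R', '2', '4');
        return true;
    case SKETCH_FORMAT_ARGB8888:
        *fourcc = SKETCH_FOURCC('A', 'R', '2', '4');
        return true;
    default:
        return false;
    }
}

/* Prefer the queried fourcc; fall back to the format table when the
 * query is unsupported (modelled here by query_ok == false). */
static bool sketch_resolve_fourcc(bool query_ok, uint32_t queried,
                                  enum sketch_format fmt, uint32_t *fourcc)
{
    if (query_ok) {
        *fourcc = queried;
        return true;
    }
    return sketch_format_to_fourcc(fmt, fourcc);
}
```

The caller still has to handle a false return, since an unknown format leaves no safe fourcc to hand to the compositor.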