Marek Olšák [Wed, 21 Nov 2018 23:06:54 +0000 (18:06 -0500)]
winsys/amdgpu: overallocate buffers for faster address translation on Gfx9
Sadly, the 3 games I tested (DeusEx:MD, DiRT Rally, DOTA 2) are unaffected
by the overallocation, because I guess their buffers don't fall into
the small range below a power-of-two size.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Marek Olšák [Fri, 23 Nov 2018 23:27:00 +0000 (18:27 -0500)]
winsys/amdgpu: increase the VM alignment to the MSB of the size for Gfx9
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
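A minimal sketch of what "the MSB of the size" means as an alignment value. The helper name msb_alignment is hypothetical and this is not the winsys code; any clamping to page-size or hardware limits that the real driver may apply is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Return the value of the most significant set bit of size, i.e. the
 * largest power of two that is <= size. size must be non-zero. */
static uint64_t msb_alignment(uint64_t size)
{
   assert(size != 0);

   uint64_t align = 1;
   while (size >>= 1)
      align <<= 1;
   return align;
}
```

Under this sketch a 3 MB buffer would get a 2 MB alignment, and a buffer whose size is already a power of two is aligned to its own size.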
Marek Olšák [Wed, 21 Nov 2018 22:28:13 +0000 (17:28 -0500)]
winsys/amdgpu: use >= instead of > for VM address alignment
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Marek Olšák [Fri, 23 Nov 2018 23:20:49 +0000 (18:20 -0500)]
winsys/amdgpu: clean up code around BO VM alignment
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Marek Olšák [Wed, 21 Nov 2018 07:15:11 +0000 (02:15 -0500)]
winsys/amdgpu: optimize slab allocation for 2 MB amdgpu page tables
- the slab buffer size increased from 128 KB to 2 MB (PTE fragment size)
- the max suballocated buffer size increased from 64 KB to 256 KB,
this increases memory usage because it wastes memory
- the number of suballocators increased from 1 to 3 and they are layered
on top of each other to minimize unused space in slabs
The final increase in memory usage is:
DeusEx:MD: 1.8%
DOTA 2: 1.75%
DiRT Rally: 0.2%
The kernel driver will also receive fewer buffers.
Marek Olšák [Wed, 21 Nov 2018 07:10:14 +0000 (02:10 -0500)]
radeonsi: generalize the slab allocator code to allow layered slab allocators
There is no change in behavior. It just makes it easier to change the number
of slab allocators.
Marek Olšák [Wed, 21 Nov 2018 05:22:48 +0000 (00:22 -0500)]
winsys/amdgpu: always reclaim/release slabs if there is not enough memory
Marek Olšák [Tue, 20 Nov 2018 03:29:00 +0000 (22:29 -0500)]
radeonsi: fix is_oneway_access_only for bindless images
Marek Olšák [Tue, 20 Nov 2018 03:27:49 +0000 (22:27 -0500)]
radeonsi/nir: parse more information about bindless usage
fill more tgsi_shader_info fields.
Marek Olšák [Tue, 20 Nov 2018 03:27:15 +0000 (22:27 -0500)]
tgsi/scan: add more information about bindless usage
radeonsi will use this.
Marek Olšák [Tue, 20 Nov 2018 02:54:37 +0000 (21:54 -0500)]
radeonsi: small cleanup for memory opcodes
Marek Olšák [Tue, 20 Nov 2018 02:53:55 +0000 (21:53 -0500)]
radeonsi: fix is_oneway_access_only for image stores
We need to look at the Dst for image stores.
Marek Olšák [Tue, 20 Nov 2018 01:36:35 +0000 (20:36 -0500)]
radeonsi: use structured buffer intrinsics for image views
to stop using the workaround in si_make_buffer_descriptor.
Marek Olšák [Wed, 21 Nov 2018 01:58:17 +0000 (20:58 -0500)]
radeonsi: clean up primitive binning enablement
no change in behavior.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Dave Airlie [Wed, 28 Nov 2018 23:07:35 +0000 (09:07 +1000)]
virgl: fix undefined shift to use unsigned.
Ported from virglrenderer.
Signed-off-by: Dave Airlie <airlied@redhat.com>
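For illustration only, a generic sketch of the kind of shift this class of fix addresses. The helper name bit_mask is hypothetical and the expression is not taken from the virgl code: left-shifting a signed 1 into the sign bit is undefined behaviour when int is 32 bits wide, while the same shift on an unsigned value is well defined.

```c
#include <assert.h>
#include <stdint.h>

/* Build a single-bit mask. Writing (1 << bit) with a signed literal would
 * be undefined for bit == 31 on a 32-bit int, so shift an unsigned 1. */
static uint32_t bit_mask(unsigned bit)
{
   assert(bit < 32);
   return UINT32_C(1) << bit;
}
```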
Dave Airlie [Thu, 11 Oct 2018 03:44:02 +0000 (13:44 +1000)]
r600: make suballocator 256-bytes align
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108311
Cc: <mesa-stable@lists.freedesktop.org>
Kenneth Graunke [Mon, 26 Nov 2018 22:58:54 +0000 (14:58 -0800)]
intel/compiler: Use nir's info when checking uses_streams.
Vulkan and Gallium don't use Mesa's gl_program data structure, so they
can't poke at 'prog'. But we can simply use the copy of the shader info
stored with the NIR shader, which is guaranteed to exist.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Jason Ekstrand [Mon, 19 Nov 2018 18:32:16 +0000 (12:32 -0600)]
nir/derefs: Add a nir_derefs_do_not_alias enum value
This makes some of the code more clear.
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Gurchetan Singh [Wed, 28 Nov 2018 16:39:34 +0000 (08:39 -0800)]
egl: add missing #include <stddef.h> in egldevice.h
Otherwise, I get this error:
main/egldevice.h:54:13: error: ‘NULL’ undeclared (first use in this function)
dev = NULL;
^~~~
with this config:
./autogen.sh --enable-gles1 --enable-gles2 --with-platforms='surfaceless' --disable-glx
--with-dri-drivers="i965" --with-gallium-drivers="" --enable-gbm
v3: Use stddef.h (Matt)
v4: Modify commit message (Eric)
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Matt Turner [Wed, 21 Nov 2018 20:13:19 +0000 (12:13 -0800)]
gallivm: Use nextafterf(0.5, 0.0) as rounding constant
The common truncf(x + 0.5) fails for the floating-point value just less
than 0.5 (nextafterf(0.5, 0.0)). nextafterf(0.5, 0.0) + 0.5, after
rounding is 1.0, thus truncf does not produce the desired value.
The solution is to add nextafterf(0.5, 0.0) instead of 0.5 before
truncating. This works for all values.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Juan A. Suarez Romero [Wed, 28 Nov 2018 18:19:49 +0000 (19:19 +0100)]
docs: update calendar, add news item and link release notes for 18.2.6
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Juan A. Suarez Romero [Wed, 28 Nov 2018 18:14:21 +0000 (19:14 +0100)]
docs: add sha256 checksums for 18.2.6
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit cfd1f8b92cae9dde3e5bed42109b5142f50a2ee5)
Juan A. Suarez Romero [Wed, 28 Nov 2018 17:39:26 +0000 (18:39 +0100)]
docs: add release notes for 18.2.6
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 3e741344d79e3ae67b1ad645e7d56fe6c0fb2ae2)
Nicolai Hähnle [Wed, 28 Nov 2018 17:30:36 +0000 (18:30 +0100)]
egl/wayland: rather obvious build fix
Fixes: ce74a7bb8de7 ("egl/wayland: plug memory leak in drm_handle_device()")
Fixes: c59d3aa4b9bc ("egl/wayland: bail out when drmGetMagic fails")
Nicolai Hähnle [Wed, 21 Nov 2018 17:17:02 +0000 (18:17 +0100)]
winsys/amdgpu: explicitly declare whether buffer_map is permanent or not
Introduce a new driver-private transfer flag RADEON_TRANSFER_TEMPORARY
that specifies whether the caller will use buffer_unmap or not. The
default behavior is set to permanent maps, because that's what drivers
do for Gallium buffer maps.
This should eliminate the need for hacks in libdrm. Assertions are added
to catch when the buffer_unmap calls don't match the (temporary)
buffer_map calls.
I did my best to update r600 for consistency (r300 needs no changes
because it never calls buffer_unmap), even though the radeon winsys
ignores the new flag.
As an added bonus, this should actually improve the performance of
the normal fast path, because we no longer call into libdrm at all
after the first map, and there's one less atomic in the winsys itself
(there are now no atomics left in the UNSYNCHRONIZED fast path).
Cc: Leo Liu <leo.liu@amd.com>
v2:
- remove comment about visible VRAM (Marek)
- don't rely on amdgpu_bo_cpu_map doing an atomic write
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Wed, 21 Nov 2018 16:34:38 +0000 (17:34 +0100)]
winsys/amdgpu: add amdgpu_winsys_bo::lock
We'll use it in the upcoming mapping change. Sparse buffers have always
had one.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Eric Engestrom [Tue, 20 Nov 2018 17:35:27 +0000 (17:35 +0000)]
vulkan/wsi: fix s/,/;/ typo
Fixes: 59e58c348e6af16a5f2dd "vulkan/wsi: Only wait on semaphores on the first swapchain"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Emil Velikov [Tue, 27 Nov 2018 11:36:01 +0000 (11:36 +0000)]
egl/wayland: plug memory leak in drm_handle_device()
As we fail to open the node, we leak the node/device name.
v2: Log and then free() (Eric)
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
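A rough sketch of the error-path pattern, under the assumption that the name is a heap copy owned by the function until the open succeeds. sketch_open_node is a hypothetical helper, not drm_handle_device() itself.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: duplicate the node path, open it, and hand the
 * copy to the caller only on success. Returns the fd, or -1 on failure. */
static int sketch_open_node(const char *path, char **name_out)
{
   assert(path != NULL && name_out != NULL);

   size_t len = strlen(path) + 1;
   char *name = malloc(len);
   if (!name)
      return -1;
   memcpy(name, path, len);

   int fd = open(name, O_RDWR);
   if (fd == -1) {
      /* Log first, then free: the message still needs the name. */
      fprintf(stderr, "failed to open %s\n", name);
      free(name);
      return -1;
   }

   *name_out = name;
   return fd;
}
```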
Emil Velikov [Fri, 23 Nov 2018 12:55:38 +0000 (12:55 +0000)]
egl/wayland: bail out when drmGetMagic fails
Currently, when the function fails, we pass uninitialized data to the
authentication function. Stop doing that and print a warning when
the function fails.
v2: Plug memory leak in error path (Eric)
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com> (v1)
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Eric Engestrom [Tue, 27 Nov 2018 13:34:37 +0000 (13:34 +0000)]
wsi/display: fix mem leak when freeing swapchains
Fixes: da997ebec92942193955 "vulkan: Add KHR_display extension using DRM [v10]"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Keith Packard <keithp@keithp.com>
Gert Wollny [Thu, 22 Nov 2018 18:00:03 +0000 (19:00 +0100)]
i965: Set the FBO error state INCOMPLETE_ATTACHMENT only for SRGB_R8
Originally the driver reported GL_FRAMEBUFFER_UNSUPPORTED in all cases,
adding more specific error messages was not correct and broke many tests.
Mostly revert this and only report GL_FRAMEBUFFER_INCOMPLETE_ATTACHMENT
for MESA_FORMAT_R_SRGB8.
Fixes: ebcde3454552adc6d3fea8af2207aafaba857796
i965: be more specific about FBO completeness errors
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108805
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Gert Wollny [Thu, 22 Nov 2018 18:00:02 +0000 (19:00 +0100)]
i965: Explicitly handle swizzles for MESA_FORMAT_R_SRGB8
The format is emulated by using ISL_FORMAT_L8_SRGB, therefore we need to
force swizzles for the GBA channels. However, doing this only based on the
data type GL_RED breaks other formats, therefore, test specifically for the
format.
Fixes: c5363869d4971780401b21bb75083ef2518c12be
i965: Force zero swizzles for unused components in GL_RED and GL_RG
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Gert Wollny [Tue, 27 Nov 2018 19:50:44 +0000 (20:50 +0100)]
virgl: Don't try handling server fences when they are not supported
vtest doesn't implement the corresponding API and would segfault:
Program received signal SIGSEGV, Segmentation fault.
#0 0x0000000000000000 in ?? ()
#1 in virgl_fence_server_sync at
src/gallium/drivers/virgl/virgl_context.c:1049
#2 in st_server_wait_sync at
src/mesa/state_tracker/st_cb_syncobj.c:155
so just don't do the call when the function pointers are not set.
Fixes dEQP:
dEQP-GLES3.functional.fence_sync.wait_sync_smalldraw
dEQP-GLES3.functional.fence_sync.wait_sync_largedraw
Fixes: d1a1c21e7621b5177febf191fcd3d3b8ef69dc96
virgl: native fence fd support
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Robert Foss <robert.foss@collabora.com>
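A sketch of the guard, assuming the backend exposes the fence entry point as an optional function pointer. The struct and function names are hypothetical stand-ins, not the virgl winsys interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a winsys vtable. */
struct sketch_winsys {
   /* May be NULL when the backend (e.g. vtest) has no fence support. */
   void (*fence_server_sync)(struct sketch_winsys *ws, int fence_fd);
};

/* Returns true if the call was forwarded, false if it was skipped. */
static bool sketch_fence_server_sync(struct sketch_winsys *ws, int fence_fd)
{
   assert(ws != NULL);

   if (!ws->fence_server_sync)
      return false; /* not supported: skip instead of jumping to NULL */

   ws->fence_server_sync(ws, fence_fd);
   return true;
}
```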
Gert Wollny [Tue, 27 Nov 2018 19:50:43 +0000 (20:50 +0100)]
virgl,vtest: Initialize return value
Avoids:
Conditional jump or move depends on uninitialised value(s)
at 0x9E2B39F: virgl_vtest_winsys_resource_cache_create (virgl_vtest_winsys.c:379)
by 0x9E2725F: virgl_buffer_create (virgl_buffer.c:169)
by 0x9E246D5: virgl_resource_create (virgl_resource.c:60)
by 0xA0C1B9F: bufferobj_data (st_cb_bufferobjects.c:344)
by 0xA0C1B9F: st_bufferobj_data (st_cb_bufferobjects.c:390)
by 0x9F4ACE3: vbo_use_buffer_objects (vbo_exec_api.c:1136)
by 0xA0C68C3: st_create_context_priv (st_context.c:416)
by 0xA0C707A: st_create_context (st_context.c:598)
by 0x9F81C6B: st_api_create_context (st_manager.c:918)
by 0x9BBE591: dri_create_context (dri_context.c:161)
by 0x9BB6931: driCreateContextAttribs (dri_util.c:473)
by 0x4E97A44: drisw_create_context_attribs (drisw_glx.c:630)
by 0x4E7C591: glXCreateContextAttribsARB (create_context.c:78)
Uninitialised value was created by a stack allocation
at 0x9E2B249: virgl_vtest_winsys_resource_cache_create (virgl_vtest_winsys.c:342)
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Robert Foss <robert.foss@collabora.com>
Iago Toral Quiroga [Tue, 27 Nov 2018 07:57:13 +0000 (08:57 +0100)]
intel/compiler: fix register allocation in opt_peephole_sel
This wasn't handling 64-bit cases properly. Found by inspection.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Matt Turner [Tue, 30 Oct 2018 12:36:02 +0000 (05:36 -0700)]
glsl: Remove unused member variable
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Matt Turner [Mon, 26 Nov 2018 19:29:41 +0000 (11:29 -0800)]
nir: Call fflush() at the end of nir_print_shader()
We normally call with stderr which is unbuffered, so this won't affect
that, but it does let me call nir_print_shader(nir, fopen("log", "w+"))
from gdb and actually get the whole shader in my file.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Eric Anholt [Sat, 27 Oct 2018 01:02:20 +0000 (18:02 -0700)]
v3d: Add renderonly support.
I've been using this with the kmsro series to test v3d on VKMS without my
old KMS hack in the v3d kernel driver. KMSRO still needs some cleanup,
but v3d RO support seems reasonable.
Eric Anholt [Tue, 27 Nov 2018 19:25:09 +0000 (11:25 -0800)]
gallium: Remove unused variable in u_tests.
Fixes: 0d17b685b1ff ("gallium/u_tests: add a compute shader test that clears an image")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bas Nieuwenhuizen [Mon, 26 Nov 2018 02:28:05 +0000 (03:28 +0100)]
radv: Align large buffers to the fragment size.
Improves performance in Talos by about 15% (with significant improvements
in RotR and possibly other games, though I did not benchmark them with the
final patch) on kernel 4.19 and earlier.
On 4.20+ a similar effect comes from
433ca054949a "drm/amdgpu: try allocating VRAM as power of two"
v2: Do not impact the alignment of the physical memory.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
CC: <mesa-stable@lists.freedesktop.org>
Hyunjun Ko [Wed, 7 Nov 2018 02:30:31 +0000 (11:30 +0900)]
freedreno: implement get_sample_position
Since 1285f71d3e landed, the driver needs to provide apps with proper
sample positions for MSAA.
There is currently no way to query these from the hw, so the values are
taken from the blob driver.
Fixes: dEQP-GLES31.functional.texture.multisample.samples_#.sample_position
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Fri, 23 Nov 2018 16:30:34 +0000 (11:30 -0500)]
freedreno/a3xx: also set FSSUPERTHREADENABLE
We set the equivalent bit in SP_FS_CTRL_REG0. Somehow the hw doesn't hang
with this mismatched config, but it does run slower. It is faster with
either neither bit or both bits set, and setting both is the fastest of
the three configurations. Worth a bit over 10% gain in glmark2.
Spotted-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Jonathan Marek [Mon, 19 Nov 2018 21:02:15 +0000 (16:02 -0500)]
freedreno: use MSM_BO_SCANOUT with scanout buffers
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Jonathan Marek [Tue, 13 Nov 2018 16:42:33 +0000 (11:42 -0500)]
freedreno: use GENERIC instead of TEXCOORD for blit program
blit_fp uses GENERIC as input, so blit_vp should match for linking.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Jonathan Marek [Tue, 13 Nov 2018 16:40:03 +0000 (11:40 -0500)]
freedreno: a2xx texture update
Adds all missing texture related logic. For everything to work it also
needs changes to ir2/fd2_program, which are part of the ir2 update patch.
Note: it needs rnndb update
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
[remove stray patch]
Signed-off-by: Rob Clark <robdclark@gmail.com>
Jonathan Marek [Tue, 13 Nov 2018 16:30:31 +0000 (11:30 -0500)]
freedreno/a2xx: Compute depth base in gmem correctly
Note: it needs rnndb update
Signed-off-by: Marek Vasut <marex@denx.de>
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Jonathan Marek [Tue, 13 Nov 2018 16:26:38 +0000 (11:26 -0500)]
freedreno/a2xx: set VIZ_QUERY_ID on a20x
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Jonathan Marek [Tue, 13 Nov 2018 16:19:39 +0000 (11:19 -0500)]
freedreno: add missing a20x ids
200: 256KiB GMEM A200 (imx53)
201: 128KiB GMEM A200 (imx51)
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Jonathan Marek [Tue, 13 Nov 2018 16:17:48 +0000 (11:17 -0500)]
freedreno/a2xx: fix POINT_MINMAX_MAX overflow
As it stands, it overflows to zero.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Jonathan Marek [Tue, 13 Nov 2018 16:04:32 +0000 (11:04 -0500)]
freedreno: a2xx: fd2_draw update
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Jonathan Marek [Mon, 12 Nov 2018 17:49:32 +0000 (12:49 -0500)]
nir: add fceil lowering
lowers ceil(x) as -floor(-x)
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Sun, 18 Nov 2018 15:02:47 +0000 (10:02 -0500)]
freedreno: update generated headers
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Mon, 19 Nov 2018 15:24:40 +0000 (10:24 -0500)]
freedreno/a6xx: set guardband clip
On older gens, the CLIP_ADJ bitfields were actually 3.6 fixed point.
Which might make more sense. Although this formula comes up with values
pretty close to what blob does for various viewport sizes (for at least
a5xx and a6xx), and seems to work.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Tue, 13 Nov 2018 19:49:25 +0000 (14:49 -0500)]
freedreno/a6xx: disable LRZ for z32
f6131d4ec7a had the side effect of enabling LRZ w/ 32b depth buffers.
But there are some bugs with this, which aren't fully understood yet,
so for now just skip LRZ w/ z32.
Fixes: f6131d4ec7a freedreno/a6xx: Clear z32 and separate stencil with blitter
Signed-off-by: Rob Clark <robdclark@gmail.com>
Kristian H. Kristensen [Fri, 19 Oct 2018 21:29:49 +0000 (14:29 -0700)]
freedreno/a6xx: Clear gmem buffers at flush time
We generate an IB to clear the gmem at flush time and jump to it
before rendering each tile. This lets us get rid of the command stream
patching for gmem offsets.
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Kristian H. Kristensen [Wed, 24 Oct 2018 07:00:50 +0000 (00:00 -0700)]
freedreno/a6xx: Move resolve blits to an IB
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Kristian H. Kristensen [Wed, 24 Oct 2018 06:50:34 +0000 (23:50 -0700)]
freedreno/a6xx: Move restore blits to IB
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Tue, 13 Nov 2018 19:19:38 +0000 (14:19 -0500)]
mesa/st: better colormask check for clear fallback
For RGB surfaces (for example) we don't really care that the colormask
is 0x7 instead of 0xf. This should not trigger clear_with_quad()
slowpath.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
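One way to picture the relaxed check, as a sketch only: compare the colormask against the channels the surface actually has, rather than against 0xf. The names below are hypothetical and the real state-tracker code may derive the channel set differently.

```c
#include <assert.h>
#include <stdbool.h>

/* One bit per channel: R = bit 0, G = bit 1, B = bit 2, A = bit 3. */
#define SKETCH_MASK_RGB  0x7u
#define SKETCH_MASK_RGBA 0xfu

/* True if every channel present in the surface is write-enabled, so a
 * plain clear suffices and no quad-draw fallback is needed. */
static bool sketch_colormask_is_full(unsigned colormask,
                                     unsigned surface_channels)
{
   assert((surface_channels & ~SKETCH_MASK_RGBA) == 0);
   return (colormask & surface_channels) == surface_channels;
}
```

With this test an RGB surface and a colormask of 0x7 counts as fully enabled, while the same mask on an RGBA surface does not.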
Rob Clark [Tue, 13 Nov 2018 18:40:58 +0000 (13:40 -0500)]
mesa/st: swap order of clear() and clear_with_quad()
If we can't clear all the buffers with pctx->clear() (say, for example,
because of ColorMask), push the buffers we *can* clear with pctx->clear()
first. Tilers want to see clears coming before draws to enable fast-
paths, and clearing one of the attachments with a quad-draw first
confuses that logic.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Rob Clark [Sat, 10 Nov 2018 17:05:59 +0000 (12:05 -0500)]
freedreno: move ir3 to common location
Move (most of) the ir3 compiler to src/freedreno/ir3 so that it can be
re-used by some future vulkan driver. The parts that are gallium
specific have been refactored out and remain in the gallium driver.
Getting the move done now so that it can happen before further
refactoring to support a6xx specific instructions.
NOTE also removes ir3_cmdline compiler tool from autotools build since
that was easier than fixing it and I normally use meson build. Waiting
patiently for the day that we can remove *everything* from the autotools
build.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Sat, 10 Nov 2018 15:32:36 +0000 (10:32 -0500)]
freedreno/ir3: remove u_inlines usage
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Sat, 10 Nov 2018 15:11:18 +0000 (10:11 -0500)]
freedreno/ir3: split up ir3_shader
Split the parts that are gallium specific into ir3_gallium so the rest
can move to a common location outside of gallium.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Fri, 9 Nov 2018 19:46:28 +0000 (14:46 -0500)]
freedreno/ir3: remove pipe_stream_output_info dependency
A bit annoying to have to copy into our own struct. But this is
something the compiler really needs to know, at least on earlier
generations where streamout is implemented in shader.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Fri, 9 Nov 2018 14:40:20 +0000 (09:40 -0500)]
freedreno/ir3: some header file cleanup
Clean up some of the low-hanging-fruit usages of freedreno_util.h
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Fri, 9 Nov 2018 18:49:55 +0000 (13:49 -0500)]
freedreno/ir3: use env_var_as_unsigned()
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Fri, 9 Nov 2018 18:28:36 +0000 (13:28 -0500)]
util: env_var_as_unsigned() helper
So I can drop env2u() helper from freedreno_util.h and get rid of one
small ir3 dependency on gallium/freedreno
Signed-off-by: Rob Clark <robdclark@gmail.com>
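A sketch of how such a helper could look, assuming it falls back to a caller-supplied default when the variable is unset or not a clean number. The function names here are hypothetical and the exact behaviour of the util helper may differ.

```c
#include <errno.h>
#include <stdlib.h>

/* Parse str as an unsigned number (decimal, 0x hex or 0 octal); return
 * fallback if str is NULL, empty, out of range or has trailing junk. */
static unsigned parse_unsigned_or(const char *str, unsigned fallback)
{
   if (!str)
      return fallback;

   char *end;
   errno = 0;
   unsigned long value = strtoul(str, &end, 0);
   if (errno != 0 || end == str || *end != '\0')
      return fallback;

   return (unsigned)value;
}

/* Read an environment variable as an unsigned value. */
static unsigned env_as_unsigned_sketch(const char *name, unsigned fallback)
{
   return parse_unsigned_or(getenv(name), fallback);
}
```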
Rob Clark [Fri, 9 Nov 2018 16:08:16 +0000 (11:08 -0500)]
freedreno/ir3: move disasm and optmsgs debug flags
Move them to IR3_SHADER_DEBUG so we can remove ir3's dependency on
fd_mesa_debug.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Fri, 9 Nov 2018 15:57:18 +0000 (10:57 -0500)]
freedreno: FD_SHADER_DEBUG -> IR3_SHADER_DEBUG
Only used by ir3, so move it into ir3 to be more self contained.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Fri, 9 Nov 2018 15:50:48 +0000 (10:50 -0500)]
freedreno: remove shader_stage_name()
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Fri, 9 Nov 2018 15:47:14 +0000 (10:47 -0500)]
freedreno: shader_t -> gl_shader_stage
Just massive search/replace for the most part.
Step towards removing ir3 dependency on disasm.h which is shared by
a2xx. One step closer to being able to move ir3 out of gallium.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Fri, 9 Nov 2018 14:17:03 +0000 (09:17 -0500)]
freedreno/ir3: standalone compiler updates
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Sun, 11 Nov 2018 15:10:46 +0000 (10:10 -0500)]
freedreno: move drm to common location
So that we can re-use at least parts of it for vulkan driver, and so
that we can move ir3 to a common location (which uses fd_bo to allocate
storage for shaders)
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Sun, 11 Nov 2018 15:16:30 +0000 (10:16 -0500)]
freedreno/drm: remove dependency on gallium driver
Prep work to move drm to a common location.
Slightly hacky, but the softpin debug flag is only temporary.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Dylan Baker [Fri, 9 Nov 2018 18:40:15 +0000 (10:40 -0800)]
util: promote u_memory to src/util
as well as os_memory*
Reviewed-by: Rob Clark <robdclark@gmail.com>
Eric Anholt [Mon, 26 Nov 2018 21:11:31 +0000 (13:11 -0800)]
gallium: Fix uninitialized variable warning in compute test.
The compiler doesn't know that ny != 0, so x might be uninitialized for
the printf at the end.
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
Bas Nieuwenhuizen [Sat, 24 Nov 2018 22:21:05 +0000 (23:21 +0100)]
radv: Clamp gfx9 image view extents to the allocated image extents.
Mirrors AMDVLK. Looks like if we go over the alignment of height
we actually start to change the addressing. Seems like the extra
miplevels actually work with this.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108245
Fixes: f6cc15dccd5 "radv/gfx9: fix block compression texture views. (v2)"
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Iago Toral Quiroga [Tue, 27 Nov 2018 08:43:12 +0000 (09:43 +0100)]
intel/compiler: fix indentation style in opt_algebraic()
Anuj Phogat [Fri, 12 Oct 2018 21:13:21 +0000 (14:13 -0700)]
anv/icl: Set use full ways in L3CNTLREG
L3 allocation table in h/w specification recommends using 4 KB
granularity for programming allocation fields in L3CNTLREG.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Anuj Phogat [Wed, 17 Oct 2018 22:16:37 +0000 (15:16 -0700)]
intel/icl: Set way_size_per_bank to 4
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Anuj Phogat [Thu, 11 Oct 2018 17:52:16 +0000 (10:52 -0700)]
i965/icl: Set use full ways in L3CNTLREG
L3 allocation table in h/w specification recommends using 4 KB
granularity for programming allocation fields in L3CNTLREG.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Anuj Phogat [Tue, 2 Oct 2018 11:28:10 +0000 (04:28 -0700)]
i965/icl: Fix L3 configurations
Use L3 configuration specified in h/w specification.
V2: Drop configs which do under allocation of l3 cache.
Bump up the comment above table.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Eric Engestrom [Fri, 23 Nov 2018 16:37:50 +0000 (16:37 +0000)]
build: stop defining unused VERSION
Scons and autotools don't define it, and as of last commit nothing
uses it.
`VERSION` is also a generic enough name that something somewhere will
eventually clash, and we don't want to repeat the LLVM `DEBUG` fiasco.
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Eric Engestrom [Fri, 23 Nov 2018 15:13:02 +0000 (15:13 +0000)]
vulkan/utils: s/VERSION/PACKAGE_VERSION/
Everything else uses PACKAGE_VERSION, so let's be consistent, and
VERSION and PACKAGE_VERSION are currently defined to be the same in
meson and android, while VERSION is undefined in autotools and scons.
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Eric Engestrom [Fri, 23 Nov 2018 17:08:28 +0000 (17:08 +0000)]
anv: correctly use vulkan 1.0 by default
Per chapter 3.2 "Instances":
> Providing a NULL VkInstanceCreateInfo::pApplicationInfo or providing
> an apiVersion of 0 is equivalent to providing an apiVersion of
> VK_MAKE_VERSION(1,0,0).
Reported-by: Niklas Haas <git@haasn.xyz>
Fixes: 8c048af5890d43578ca4 "anv: Copy the appliation info into the instance"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
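The rule quoted from the spec can be sketched without the Vulkan headers. The struct and function below are hypothetical stand-ins for VkApplicationInfo and the driver logic; only the version bit layout mirrors VK_MAKE_VERSION.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same bit layout as VK_MAKE_VERSION, redefined locally so the sketch
 * builds without the Vulkan headers. */
#define SKETCH_MAKE_VERSION(major, minor, patch) \
   ((((uint32_t)(major)) << 22) | (((uint32_t)(minor)) << 12) | \
    ((uint32_t)(patch)))

static_assert(SKETCH_MAKE_VERSION(1, 0, 0) == (UINT32_C(1) << 22),
              "major version lives in bits 22 and up");

/* Hypothetical stand-in for VkApplicationInfo. */
struct sketch_app_info {
   uint32_t api_version;
};

/* A missing application info or an apiVersion of 0 means Vulkan 1.0. */
static uint32_t effective_api_version(const struct sketch_app_info *app)
{
   if (app == NULL || app->api_version == 0)
      return SKETCH_MAKE_VERSION(1, 0, 0);
   return app->api_version;
}
```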
Erik Faye-Lund [Mon, 26 Nov 2018 19:36:04 +0000 (20:36 +0100)]
mesa/main: fixup requirements for GL_PRIMITIVES_GENERATED
This enum is also allowed by EXT_tessellation_shader, which is supported
on older i965 HW (as opposed to OES_geometry_shader). This was missed
when narrowing this code-path, leading to dEQP regressions.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108868
Fixes: f09d94fbd11 "mesa/main: fix validation of transform-feedback queries"
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
Erik Faye-Lund [Thu, 22 Nov 2018 10:10:50 +0000 (11:10 +0100)]
mesa/main: fix incorrect depth-error
If glGetTexImage or glGetnTexImage is called with a level that doesn't
exist, we get an error message of this form:
Mesa: User error: GL_INVALID_VALUE in glGetTexImage(depth = 0)
This is clearly nonsensical, because these APIs don't even have a
depth-parameter. The reason is that get_texture_image_dims() returns
all-zero dimensions for non-existent texture-images, and we go on to
validate these dimensions as if they were user-input, because
glGetTextureSubImage requires checking.
So let's split this logic in two, so glGetTextureSubImage can have
stricter input-validation. All arguments that are no longer validated
are generated internally by mesa, so there's no use in validating them.
Fixes: 42891dbaa12 "gettextsubimage: verify zoffset and depth are correct"
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Erik Faye-Lund [Thu, 22 Nov 2018 16:40:47 +0000 (17:40 +0100)]
mesa/main: check cube-completeness in common code
This check is the only part of dimensions_error_check that isn't about
error-checking the offset and size arguments of
glGet[Compressed]TextureSubImage(), so it doesn't really belong in here.
This doesn't make a difference right now, apart from changing the
precedence of this error. But it will make a difference for the next
patch, where we no longer call this method from the non-sub tex-image
getters.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Erik Faye-Lund [Thu, 22 Nov 2018 11:37:33 +0000 (12:37 +0100)]
mesa/main: factor out common error-checking
This error checking is the same for teximage and texsubimage getters, so
let's factor it out to its own function.
This will be useful when getteximage and gettexsubimage get their own
error checking routines a bit later.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Erik Faye-Lund [Thu, 22 Nov 2018 11:17:32 +0000 (12:17 +0100)]
mesa/main: factor out tex-image error-checking
This will be useful when we split error-checking for getteximage and
gettexsubimage later.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Erik Faye-Lund [Thu, 22 Nov 2018 14:17:13 +0000 (15:17 +0100)]
mesa/main: remove bogus error for zero-sized images
The explanation quotes the spec on the following wording to justify the
error:
"An INVALID_VALUE error is generated if xoffset + width is greater than
the texture’s width, yoffset + height is greater than the texture’s
height, or zoffset + depth is greater than the texture’s depth."
However, this shouldn't generate an error in the case where *all three*
of width, xoffset and the texture's width are zero. In this case, we end
up generating an unspecified error.
So let's remove this check, and instead make sure that we consider this
as an empty texture.
No error is mandated by the spec in the xoffset/yoffset/zoffset = 0
case, so let's not generate one. We already avoid doing any work in
this case, because of the final, non-error-generating check in this
function.
Fixes: b37b35a5d26 "getteximage: assume texture image is empty for non defined levels"
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Erik Faye-Lund [Wed, 21 Nov 2018 19:09:46 +0000 (20:09 +0100)]
mesa/main: remove ARB suffix from glGetnTexImage
This function has been core since OpenGL 4.3, so naming the
implementation and reporting errors using an ARB-suffix can be
confusing.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Gert Wollny [Fri, 16 Nov 2018 18:12:46 +0000 (19:12 +0100)]
glsl: free or reuse memory allocated for TF varying
When a shader program is de-serialized, the gl_shader_program passed in
may actually still hold memory allocations for the transform feedback
varyings. If that is the case, free the varying names and reallocate
the storage for the names array.
This fixes a memory leak:
Direct leak of 48 byte(s) in 6 object(s) allocated from:
in malloc (/usr/lib64/gcc/x86_64-pc-linux-gnu/7.3.0/libasan.so+0xdb880)
in transform_feedback_varyings ../../samba/mesa/src/mesa/main/transformfeedback.c:875
in _mesa_TransformFeedbackVaryings ../../samba/mesa/src/mesa/main/transformfeedback.c:985
...
Indirect leak of 42 byte(s) in 6 object(s) allocated from:
in __interceptor_strdup (/usr/lib64/gcc/x86_64-pc-linux-gnu/7.3.0/libasan.so+0x761c8)
in transform_feedback_varyings ../../samba/mesa/src/mesa/main/transformfeedback.c:887
in _mesa_TransformFeedbackVaryings ../../samba/mesa/src/mesa/main/transformfeedback.c:985
Fixes: ab2643e4b06f63c93a57624003679903442634a8 "glsl: serialize data from glTransformFeedbackVaryings"
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
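The free-then-reallocate pattern described above can be sketched in C as
follows. This is a simplified illustration, not the actual patch: struct
tf_varyings and tf_varyings_replace() are hypothetical stand-ins for the
transform feedback fields of gl_shader_program and for the
deserialization code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the TF varying state of a shader program. */
struct tf_varyings {
   unsigned num;
   char **names;
};

/* Release any names still held from an earlier call, then allocate new
 * storage and copy the new names in. Returns 0 on success, -1 if an
 * allocation fails (the struct is left empty in that case). */
static int
tf_varyings_replace(struct tf_varyings *tf, unsigned num,
                    const char *const *names)
{
   assert(tf != NULL);

   /* Free the old allocations first; skipping this is the leak. */
   for (unsigned i = 0; i < tf->num; i++)
      free(tf->names[i]);
   free(tf->names);
   tf->names = NULL;
   tf->num = 0;

   if (num == 0)
      return 0;

   tf->names = calloc(num, sizeof(char *));
   if (!tf->names)
      return -1;

   for (unsigned i = 0; i < num; i++) {
      size_t len = strlen(names[i]) + 1;
      tf->names[i] = malloc(len);
      if (!tf->names[i]) {
         /* Roll back the partially filled array. */
         for (unsigned j = 0; j < i; j++)
            free(tf->names[j]);
         free(tf->names);
         tf->names = NULL;
         return -1;
      }
      memcpy(tf->names[i], names[i], len);
   }
   tf->num = num;
   return 0;
}
```

Calling tf_varyings_replace() twice on the same struct releases the
first set of names before storing the second, which is the behavior the
leak report above shows was missing.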
Bas Nieuwenhuizen [Sat, 24 Nov 2018 19:52:20 +0000 (20:52 +0100)]
radv: Fix opaque metadata descriptor last layer.
We used the layer count, which results in an off-by-one error.
Not sure this really affects anything.
Fixes: f4e499ec791 "radv: add initial non-conformant radv vulkan driver"
Reviewed-by: Dave Airlie <airlied@redhat.com>
Mathias Fröhlich [Thu, 1 Nov 2018 18:03:26 +0000 (19:03 +0100)]
mesa/st: Make st_pipe_vertex_format static.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Mathias Fröhlich [Thu, 1 Nov 2018 18:03:26 +0000 (19:03 +0100)]
mesa/st: Use binding information from the VAO in feedback rendering.
Use VAO binding information in feedback rendering. In theory
it should reduce the number of buffer objects scheduled for rendering.
Feedback rendering is implemented in a crude way anyhow, so I do not
expect much gain here. But for the sake of code reuse we should
use the same code for the same task. And finally, if feedback rendering
gets improved, the array setup is already well done there.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Mathias Fröhlich [Thu, 1 Nov 2018 18:03:26 +0000 (19:03 +0100)]
mesa/st: Avoid extra references in the feedback draw function scope.
The change removes the reference that was held on the entries of the
vbuffers[] array. The new code no longer takes it: following the code
into draw_set_vertex_buffers(), the draw context holds another
reference until the buffers are reset again further down the function.
By that argument it should be safe to remove the additional
reference count.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Mathias Fröhlich [Thu, 1 Nov 2018 18:03:26 +0000 (19:03 +0100)]
mesa/st: Factor out array and buffer setup from st_atom_array.c.
Factor out vertex array setup routines from the array state atom.
The factored functions will be used in feedback rendering in the
next change.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Mathias Fröhlich [Thu, 1 Nov 2018 18:03:26 +0000 (19:03 +0100)]
mesa/st: Only unmap the uploader that was actually used.
In st_atom_array, we only need to unmap the upload buffer that
was actually used.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Mathias Fröhlich [Thu, 1 Nov 2018 18:03:26 +0000 (19:03 +0100)]
mesa/st: Only care about the uploader if it was used.
In st_atom_array, we only need to care for unmapping the upload buffer
if we actually used it.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Ilia Mirkin [Sun, 25 Nov 2018 02:56:00 +0000 (21:56 -0500)]
nv50/ir: remove dnz flag when converting MAD to ADD due to optimizations
The dnz flag only applies to multiplications (e.g. to make 0 * Infinity
become 0 instead of NaN). Once we optimize a MAD into an ADD, the dnz
flag no longer makes sense, and upsets the GM107 emitter (since it looks
at the ftz and dnz flags together).
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
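To illustrate why the flag is tied to the multiply, here is a small C
sketch. It is not the nv50/ir code (which is C++): struct instr,
mul_dnz() and mad_to_add() are hypothetical names, and mul_dnz() only
models the "0 * anything is 0" behavior described above.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Hypothetical, simplified instruction representation. */
enum op { OP_MAD, OP_ADD };

struct instr {
   enum op op;
   bool ftz; /* flush denormals to zero */
   bool dnz; /* multiply by zero yields zero, even for Inf/NaN */
};

/* Model of a dnz multiply: a zero operand forces a zero result instead
 * of the IEEE NaN that 0 * Infinity would produce. */
static float
mul_dnz(float a, float b)
{
   if (a == 0.0f || b == 0.0f)
      return 0.0f;
   return a * b;
}

/* When an optimization turns a MAD into an ADD there is no multiply
 * left, so the dnz flag is cleared; ftz is left untouched. */
static void
mad_to_add(struct instr *insn)
{
   assert(insn->op == OP_MAD);
   insn->op = OP_ADD;
   insn->dnz = false;
}
```

Under plain IEEE rules 0.0f * INFINITY is NaN, whereas
mul_dnz(0.0f, INFINITY) returns 0; after mad_to_add() the instruction no
longer carries a flag that only has meaning for a multiply.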