Dylan Baker [Thu, 16 Nov 2017 01:09:33 +0000 (17:09 -0800)]
meson: Remove completed or irrelevant TODO comments
These are all either done already, or are autotools specific. The
misspelled gallium G3DVL is the autotools specific bit, meson is
handling that via build_by_default.
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Dylan Baker [Thu, 16 Nov 2017 01:07:37 +0000 (17:07 -0800)]
meson: Fix TODO for missing dl_iterate_phdr function
This function is required for both the Intel "Anvil" vulkan driver and
the i965 GL driver. Error out if either of those is enabled but this
function isn't found.
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Dylan Baker [Thu, 16 Nov 2017 00:53:40 +0000 (16:53 -0800)]
meson: disable x86 asm in fewer cases.
This patch allows building asm for x86 on x86_64 platforms, when the
operating system is the same. Previously cross compile always turned off
assembly. This allows using a cross file to cross compile x86 binaries
on x86_64 with asm.
This could probably be relaxed further thanks to meson's "exe_wrapper",
which is way to specify an emulator or compatibility layer (wine) that
can run the foreign binaries on the build system. Since the meson build
at this point only supports building on Linux I can't test this and I
don't want to write/enable code that cannot even be build tested.
v4: - set condition to build == x86_64 and host == x86 and
build.system == host.system
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Dylan Baker [Thu, 16 Nov 2017 00:09:22 +0000 (16:09 -0800)]
meson: Enable SSE4.1 optimizations
This patch checks for an and then enables sse4.1 optimizations if the
host machine will be x86/x86_64.
v2: - Don't compile code, it's unnecessary since we require a compiler
which always has SSE4.1 (Matt)
v3: - x64 -> x86_64 (Matt)
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Eric Anholt [Wed, 22 Nov 2017 00:32:33 +0000 (16:32 -0800)]
broadcom/vc5: Fix BASE_LEVEL handling with txl.
The HW doesn't add the base level anywhere (the min/max lod clamping is
what does base level), so we need to add it manually in this case.
Fixes piglit tex-miplevel-selection *Lod 2D.
Eric Anholt [Wed, 22 Nov 2017 00:05:49 +0000 (16:05 -0800)]
broadcom/vc5: Fix array texture layer count setup.
Fixes piglit array-texture.
Eric Anholt [Tue, 21 Nov 2017 23:27:20 +0000 (15:27 -0800)]
broadcom/vc5: Don't increment primitive queries while they're paused.
Fixes ext_transform_feedback-generatemipmap prims_generated
Eric Anholt [Tue, 21 Nov 2017 23:20:31 +0000 (15:20 -0800)]
broadcom/vc5: Fix incorrect padding of TF outputs.
After the first output, we were padding by an extra size of the previous
output. Fixes piglit ext_transform_feedback-output-type mat4x3[2] and
friends.
Eric Anholt [Tue, 21 Nov 2017 23:00:36 +0000 (15:00 -0800)]
broadcom/vc5: Fix UIF surface size setup for ARB_fbo's mismatched sizes.
The HW was computing an implicit height for the surface based on the image
size, but that may be smaller than the surface with ARB_fbo mismatched
sizes. In that case, we need to tell it about the pad, either with the
little 4-bit field in the RT config, or the extended field in
CLEAR_COLORS_PART3.
Fixes piglit arb_framebuffer_object-mixed-buffer-sizes.
Wladimir J. van der Laan [Sat, 18 Nov 2017 09:44:25 +0000 (10:44 +0100)]
etnaviv: Put HALTI level in specs
The HALTI level is an indication of the gross architecture of the GPU.
It determines for significant part what feature level the GPU has, what
state (especially frontend state) is there, and where it is located.
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Wladimir J. van der Laan [Sat, 18 Nov 2017 09:44:24 +0000 (10:44 +0100)]
etnaviv: Const-correctness etnaviv_emit.h
The relocation structure is never changed by submitting it.
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Juan A. Suarez Romero [Tue, 21 Nov 2017 11:38:27 +0000 (12:38 +0100)]
meson: add si_driinfo.h in libgallium_dri
v2: generate target conditionally (Dylan)
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Iago Toral Quiroga [Thu, 16 Nov 2017 07:53:07 +0000 (08:53 +0100)]
nir/gather_info: recognize load_patch_vertices_in as a system value
This intrinsic is produced to load SYSTEM_VALUE_VERTICES_IN, which is
generated to load gl_PatchVerticesIn in the SPIR-V path for both
Vulkan and OpenGL.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Jordan Justen [Wed, 15 Nov 2017 00:27:34 +0000 (16:27 -0800)]
i965: Support decoding INTERFACE_DESCRIPTOR_DATA with INTEL_DEBUG=bat
This will dump the INTERFACE_DESCRIPTOR_DATA along with the associated
samplers & surfaces.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
Kristian H. Kristensen [Wed, 30 Nov 2016 05:07:57 +0000 (21:07 -0800)]
intel/genxml: Add helpers for determining field type
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Matt Turner [Mon, 20 Nov 2017 22:21:43 +0000 (14:21 -0800)]
i965/fs: Check ADD/MAD with immediates in satprop unit test
The gen had to be changed from 4 to 6 so that we could test MAD, which
is new on Gen6.
mad_imm_float_neg_mov_sat tests the case fixed by the previous commit.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Matt Turner [Mon, 20 Nov 2017 22:24:57 +0000 (14:24 -0800)]
i965/fs: Handle negating immediates on MADs when propagating saturates
MADs don't take immediate sources, but we allow them in the IR since it
simplifies a lot of things. I neglected to consider that case.
Fixes: 4009a9ead490 ("i965/fs: Allow saturate propagation to propagate
negations into MADs.")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103616
Reported-and-Tested-by: Ruslan Kabatsayev <b7.10110111@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Juan A. Suarez Romero [Wed, 15 Nov 2017 16:49:21 +0000 (16:49 +0000)]
mesa/teximage: add TEXTURE_CUBE_MAP_ARRAY target for CompressedTexImage3D
From section 8.7, page 179 of OpenGL ES 3.2 spec:
An INVALID_OPERATION error is generated by CompressedTexImage3D
if internalformat is one of the the formats in table 8.17 and target
is not TEXTURE_2D_ARRAY, TEXTURE_CUBE_MAP_ARRAY or TEXTURE_3D.
An INVALID_OPERATION error is generated by CompressedTexImage3D if
internalformat is TEXTURE_CUBE_MAP_ARRAY and the “Cube Map Array”
column of table 8.17 is not checked, or if internalformat is
TEXTURE_3D and the “3D Tex.” column of table 8.17 is not checked.
So far it was only considering TEXTURE_2D_ARRAY as valid target. But as
"Cube Map Array" column is checked for all the cases, in practice we can
consider also TEXTURE_CUBE_MAP_ARRAY.
This fixes KHR-GLES32.core.texture_cube_map_array.etc2_texture
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Tapani Pälli [Mon, 20 Nov 2017 08:57:17 +0000 (10:57 +0200)]
intel: fix disasm_info memory leaks
Fixes: 4f82b1728719 ("i965: Rewrite disassembly annotation code")
Cc: Matt Turner <mattst88@gmail.com>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Timothy Arceri [Thu, 16 Nov 2017 00:16:10 +0000 (11:16 +1100)]
st/glsl_to_nir: don't generate nir twice for gs
This was left out of
c980a3aa3133
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Roland Scheidegger [Sat, 18 Nov 2017 05:23:35 +0000 (06:23 +0100)]
llvmpipe: fix snorm blending
The blend math gets a bit funky due to inverse blend factors being
in range [0,2] rather than [-1,1], our normalized math can't really
cover this.
src_alpha_saturate blend factor has a similar problem too.
(Note that piglit fbo-blending-formats test is mostly useless for
anything but unorm formats, since not just all src/dst values are
between [0,1], but the tests are crafted in a way that the results
are between [0,1] too.)
v2: some formatting fixes, and fix a fairly obscure (to debug)
issue with alpha-only formats (not related to snorm at all), where
blend optimization would think it could simplify the blend equation
if the blend factors were complementary, however was using the
completely unrelated rgb blend factors instead of the alpha ones...
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Dave Airlie [Fri, 13 May 2016 04:35:33 +0000 (14:35 +1000)]
r600: add cull distance support
This passes all the tests in piglit.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Aravindan Muthukumar [Thu, 9 Nov 2017 05:45:28 +0000 (11:15 +0530)]
i965: Optimize bucket index calculation
Reducing Bucket index calculation to O(1).
This algorithm calculates the index using matrix method. Assuming
PAGE_SIZE is 4096, matrix arrangement is as below:
1*4096 2*4096 3*4096 4*4096
5*4096 6*4096 7*4096 8*4096
10*4096 12*4096 14*4096 16*4096
20*4096 24*4096 28*4096 32*4096
... ... ... ...
... ... ... ...
... ... ... max_cache_size
From this matrix its clearly seen that every row follows the below way:
... ... ... n
n+(1/4)n n+(1/2)n n+(3/4)n 2n
Row is calculated as log2(size/PAGE_SIZE) Column is calculated as
converting the difference between the elements to fit into power size of
two and indexing it.
Final Index is (row*4)+(col-1)
Tested with Intel Mesa CI.
Improves performance of 3DMark on BXT by 0.705966% +/- 0.229767% (n=20)
v4: Review comments on style and code comments implemented (Ian).
v3: Review comments implemented (Ian).
v2: Review comments implemented (Jason).
Signed-off-by: Aravindan Muthukumar <aravindan.muthukumar@intel.com>
Signed-off-by: Kedar Karanje <kedar.j.karanje@intel.com>
Reviewed-by: Yogesh Marathe <yogesh.marathe@intel.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Dylan Baker [Wed, 15 Nov 2017 01:04:27 +0000 (17:04 -0800)]
meson: Guard the gallium dri componenet
Currently the target has a redundant guard, and the state tracker isn't
properly guarded.
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Dylan Baker [Wed, 15 Nov 2017 01:03:39 +0000 (17:03 -0800)]
meson: don't build gallium subdir unless we're building gallium
This will allow us to simplify some guards within the gallium directory.
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Eric Anholt [Mon, 20 Nov 2017 18:14:38 +0000 (10:14 -0800)]
broadcom/vc5: Align 1D texture miplevels to 64b.
Fixes tex-miplevel-selection GL2:texture() 1D
Eric Anholt [Mon, 20 Nov 2017 18:07:24 +0000 (10:07 -0800)]
broadcom/vc5: Clamp min lod to the last level.
Otherwise, the simulator would complain in tex-miplevel-selection that the
min/max clamp was out of order. The actual HW seems to have clamped to
the max anyway.
Eric Anholt [Mon, 20 Nov 2017 20:26:49 +0000 (12:26 -0800)]
broadcom/vc5: Increase simulator memory for tex-miplevel-selection.
We were overflowing, because of all the little 4k allocations for CLs that
were getting expanded to 128kb in the simulator due to the GMP alignment.
Tim Rowley [Fri, 10 Nov 2017 22:45:38 +0000 (16:45 -0600)]
swr/rast: Repair simd8 frontend code rot
Keep non-default simd8 frontend code running for comparison purposes.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tim Rowley [Thu, 9 Nov 2017 01:17:24 +0000 (19:17 -0600)]
swr/rast: Implement AVX-512 GATHERPS in SIMD16 fetch shader
Disabled for now.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tim Rowley [Wed, 8 Nov 2017 20:07:33 +0000 (14:07 -0600)]
swr/rast: Simplify GATHER* jit builder api
General cleanup, and prep work for possibly moving to llvm masked
gather intrinsic.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tim Rowley [Tue, 7 Nov 2017 21:24:25 +0000 (15:24 -0600)]
swr/rast: Add alignment to transpose targets
Needed to ensure alignment for avx512.
Fixes address sanitizer crash.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tim Rowley [Tue, 7 Nov 2017 19:50:11 +0000 (13:50 -0600)]
swr/rast: Cache eventmanager
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tim Rowley [Tue, 31 Oct 2017 21:46:59 +0000 (16:46 -0500)]
swr/rast: Enable AVX-512 targets in the jitter
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tim Rowley [Tue, 31 Oct 2017 14:41:02 +0000 (09:41 -0500)]
swr/rast: Points with clipdistance can't go through simplepoints path
Fixes piglit glsl-1.20:vs-clip-vertex-primitives and
glsl-1.30:vs-clip-distance-primitives.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tim Rowley [Mon, 23 Oct 2017 20:10:35 +0000 (15:10 -0500)]
swr/rast: Code style change (NFC)
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tim Rowley [Thu, 19 Oct 2017 22:33:37 +0000 (17:33 -0500)]
swr/rast: Widen fetch shader to SIMD16
Widen fetch shader to SIMD16, enable SIMD16 types in the jitter,
and provide utility EXTRACT/INSERT SIMD8 <-> SIMD16 utility functions.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tim Rowley [Wed, 18 Oct 2017 21:51:07 +0000 (16:51 -0500)]
swr/rast: Support flexible vertex layout for DS output
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Nicolai Hähnle [Fri, 10 Nov 2017 10:15:44 +0000 (11:15 +0100)]
gallium/u_threaded: avoid syncing in threaded_context_flush
We could always do the flush asynchronously, but if we're going to wait
for a fence anyway and the driver thread is currently idle, the additional
communication overhead isn't worth it.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Fri, 10 Nov 2017 09:58:10 +0000 (10:58 +0100)]
radeonsi: avoid syncing the driver thread in si_fence_finish
It is really only required when we need to flush for deferred fences.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Mon, 13 Nov 2017 13:50:17 +0000 (14:50 +0100)]
radeonsi: recompute the relative timeout after waiting for ready fence
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Fri, 10 Nov 2017 16:13:27 +0000 (17:13 +0100)]
ddebug: fix the hang detection timeout calculation
Fixes: c9fefa062b36 ("ddebug: rewrite to always use a threaded approach")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Fri, 10 Nov 2017 12:11:53 +0000 (13:11 +0100)]
ddebug: fix use-after-free of streamout targets
Fixes: b47727a83ad6 ("ddebug: implement pipelined hang detection mode")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Fri, 10 Nov 2017 10:28:28 +0000 (11:28 +0100)]
gallium/u_threaded: properly initialize fence unflushed tokens
This got lost in a rebase but never hurt anything because we happened
to always sync in fence_finish anyway...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Fri, 10 Nov 2017 11:32:44 +0000 (12:32 +0100)]
util/u_queue: really use futex-based fences
The relevant define changed in the final revision of the simple mutex
patch.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Mon, 13 Nov 2017 13:35:50 +0000 (14:35 +0100)]
util/u_queue: fix timeout handling in util_queue_fence_wait_timeout
Fixes: e3a8013de8ca ("util/u_queue: add util_queue_fence_wait_timeout")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Thu, 9 Nov 2017 13:34:20 +0000 (14:34 +0100)]
st/mesa: use asynchronous flushes in st_finish
With threaded gallium, the driver may currently be running in another
thread. In that case, we will execute all remaining commands in that
thread instead of syncing, which should be better for cache locality.
Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Thu, 9 Nov 2017 13:34:19 +0000 (14:34 +0100)]
st/mesa: implement st_server_wait_sync properly
Asynchronous flushes require a proper implementation of
st_server_wait_sync, because we could have the following with
threaded Gallium:
Context 1 app Context 1 driver Context 2
------------- ---------------- ---------
f = glFenceSync
glFlush
<-- app sync --> <-- app sync -->
glWaitSync(f)
.. draw calls ..
pipe_context::flush
for glFenceSync
pipe_context::flush
for glFlush
Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Mon, 6 Nov 2017 10:56:54 +0000 (11:56 +0100)]
u_threaded_gallium: remove synchronization in fence_server_sync
The whole point of fence_server_sync is that it can be used to
avoid waiting in the application thread.
Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Wed, 15 Nov 2017 11:51:23 +0000 (12:51 +0100)]
amd: build addrlib with C++11
It is required for LLVM anyway.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103658
Fixes: 7f33e94e43a6 ("amd/addrlib: update to latest version")
Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Wed, 15 Nov 2017 10:22:26 +0000 (11:22 +0100)]
radeonsi/gfx9: fix VM fault with fetched instance divisors
We need to account for SGPR locations in merged shaders.
This case is exercised by KHR-GL45.enhanced_layouts.vertex_attrib_locations
Fixes: 79c2e7388c7f ("radeonsi/gfx9: use SPI_SHADER_USER_DATA_COMMON")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Samuel Pitoiset [Wed, 15 Nov 2017 11:08:29 +0000 (12:08 +0100)]
radv: use a 16 bytes array for the sampled/storage image descriptors
This allows to update them with only one memcpy().
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Samuel Pitoiset [Wed, 15 Nov 2017 09:55:05 +0000 (10:55 +0100)]
radv: do not add the query pool BO to the list in vkCmdEndQuery()
As per the spec, the query identified by queryPool and query
must currently be active. Applications have to call vkCmdBeginQuery()
before, and thus the query pool BO will already be in the list.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Samuel Pitoiset [Wed, 15 Nov 2017 14:44:01 +0000 (15:44 +0100)]
radv: only load needed depth clear regs for fast depth clears
Similar to how the driver sets the depth clear regs after a
fast depth clear. Most of the time, this will copy a 32-bit reg
instead of a 64-bit reg.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Wed, 15 Nov 2017 14:44:00 +0000 (15:44 +0100)]
radv: do not add the image BO in radv_set_depth_clear_regs()
For the fast path, radv_fill_buffer() ensures that the BO is
already in the list. For the slow path, the depth surface is
part of the framebuffer which means the BO is added to the list
when the framebuffer is emitted.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Wed, 15 Nov 2017 14:43:59 +0000 (15:43 +0100)]
radv: remove useless assertion in emit_depthstencil_clear()
Already checked in emit_clear().
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Wed, 15 Nov 2017 14:43:58 +0000 (15:43 +0100)]
radv: remove useless check in radv_set_depth_clear_regs()
aspects can't be zero and there is an assertion that ensures
it's not in emit_clear().
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Dave Airlie [Sun, 19 Nov 2017 23:19:31 +0000 (09:19 +1000)]
docs/features: mark some r600 extensions supported
These just looked to be missed when this file was updated.
Signed-off-by: Dave Airlie <airlied@redhat.com>
George Barrett [Sun, 19 Nov 2017 10:55:10 +0000 (21:55 +1100)]
glsl: Catch subscripted calls to undeclared subroutines
generate_array_index fails to check whether the target of a subroutine
call exists in the AST, potentially passing around null ir_rvalue
pointers eventuating in abort/segfault.
Fixes: fd01840c0bd3 ("glsl: add AoA support to subroutines")
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100438
Eric Anholt [Fri, 13 Oct 2017 20:11:15 +0000 (13:11 -0700)]
broadcom/vc5: Fix up integer texture handling.
The original spec I had didn't expose integer textures and suggested that
you use unfiltered floats. Now there are proper formats for them.
Fixes 16- and 32-bit texwrap integer tests in piglit, and
dEQP-GLES3.functional.fbo.completeness.renderable.renderbuffer.color0.rgb10_a2ui.
Eric Anholt [Fri, 17 Nov 2017 01:50:55 +0000 (17:50 -0800)]
broadcom/vc5: Fix simulator assertion failures about color RT clears.
When we tried to clear color while storing depth, it assertion failed
about basically not having enough information to decide which color RT to
clear. It turns out the STORE_GENERAL picks the buffer according to the
color buffer being stored, or all of them if NONE. If you're doing depth,
it doesn't know which to pick.
Rob Clark [Sat, 18 Nov 2017 15:40:49 +0000 (10:40 -0500)]
freedreno/ir3: add texture gather support
Signed-off-by: Rob Clark <robdclark@gmail.com>
Lucas Stach [Wed, 15 Nov 2017 16:33:17 +0000 (17:33 +0100)]
etnaviv: enable full overwrite when no color buffer is present
The OVERWRITE bit disables destination fetches, which is exactly what
we want when there is no valid color buffer bound.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
Jason Ekstrand [Sat, 18 Nov 2017 01:27:55 +0000 (17:27 -0800)]
i965: Stop including brw_cfg.h in brw_disasm_info.h
The brw_disasm_info header is included by certain tools in order to get
shader assembly from binaries so it's a semi-external header. Including
brw_cfg.h also pulls in brw_shader.h so you end up getting quite a bit
of our back-end compiler internals. Instead, make the couple of forward
declarations we need and make the header more stand-alone. This fixes
the meson build.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Fixes: 4f82b17287194ca7d10816f6cfe4712a3e0a03fc
Jason Ekstrand [Sat, 18 Nov 2017 00:52:09 +0000 (16:52 -0800)]
i965: Mark BOs as external when we export their handle
Almost all of our BO export paths were already properly marked the BO as
external and added it to the handle table. Most export use-cases go
through a prime fd or flink where we have a brw_bo export helper that
does the right thing. The one missing one happens when you call
queryImage and ask for __DRI_IMAGE_ATTRIB_HANDLE. We just grabbed the
gem handle out of the BO (because it's really easy to do that) and
handed it off to the client; what could go wrong? As it turns out, this
path is used by basically every compositor that wants to turn around and
call drmModeAddFB2 on it so it can hand it off to display. The result,
as of
4b1e70cc57d7ff5f465544644b2180dee1490cee, is that we no longer set
MOCS_PTE on those surfaces and the kernel's attempts to disable caching
fail and we scanout gets corruption.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103759
Fixes: 4b1e70cc57d7ff5f465544644b2180dee1490cee
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
Jason Ekstrand [Sat, 18 Nov 2017 00:49:03 +0000 (16:49 -0800)]
i965/bufmgr: Add a helper to mark a BO as external
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
Andres Gomez [Sat, 18 Nov 2017 00:48:45 +0000 (02:48 +0200)]
i965: Correct disasm_info usage in eu_validate test
Fixes: 4f82b1728719 ("i965: Rewrite disassembly annotation code")
Cc: Matt Turner <mattst88@gmail.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Eric Anholt [Tue, 14 Nov 2017 23:52:53 +0000 (15:52 -0800)]
broadcom/vc5: Set up the padded height at surface creation time.
This centralizes the calculation in the surface, instead of in each
load/store.
Eric Anholt [Wed, 15 Nov 2017 23:05:37 +0000 (15:05 -0800)]
broadcom/vc5: Ensure that there is always a TLB write.
This should fix some GPU hangs in our (currently always single-threaded)
fragment shaders, and definitely fixes assertion failures in simulation.
Eric Anholt [Tue, 7 Nov 2017 23:42:04 +0000 (15:42 -0800)]
broadcom/vc5: Fix clear color for swap_color_rb render targets.
Fixes dEQP-GLES3.functional.depth_stencil_clear.depth.*
Eric Anholt [Tue, 7 Nov 2017 23:37:46 +0000 (15:37 -0800)]
broadcom/vc5: Fix pasteo in front stencil ref value setup.
Fixes piglit masked-clear.
Eric Anholt [Tue, 7 Nov 2017 23:35:33 +0000 (15:35 -0800)]
broadcom/vc5: Fix colormasking when we need to swap r/b colors.
Fixes part of piglit masked-clear.
Eric Anholt [Tue, 7 Nov 2017 23:21:06 +0000 (15:21 -0800)]
broadcom/vc5: Enable the Z min/max clipping planes.
Eric Anholt [Wed, 15 Nov 2017 00:01:32 +0000 (16:01 -0800)]
broadcom/vc5: Fix driver for new PIPE_SHADER_CAP_MAX_HW_ATOMIC_*.
Brian Paul [Fri, 17 Nov 2017 16:38:39 +0000 (09:38 -0700)]
r300: add PIPE_SHADER_CAP_MAX_HW_ATOMIC_COUNTER* switch cases
To silence compiler warnings.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Brian Paul [Fri, 17 Nov 2017 22:03:21 +0000 (15:03 -0700)]
tgsi: s/uint/enum pipe_shader_type/
Roland Scheidegger <sroland@vmware.com>
Brian Paul [Fri, 17 Nov 2017 16:51:10 +0000 (09:51 -0700)]
tgsi: bump tgsi_opcode_info::output_mode size to 4 bits
To avoid problems with MSVC. And verify size with ASSERT_BITFIELD_SIZE().
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Kenneth Graunke [Fri, 17 Nov 2017 06:31:27 +0000 (22:31 -0800)]
i965: Revert Gen8 aspect of VF PIPE_CONTROL workaround.
This apparently causes hangs on Broadwell, so let's back it out for now.
I think there are other PIPE_CONTROL workarounds that we're missing.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103787
Adam Jackson [Thu, 16 Nov 2017 18:27:27 +0000 (13:27 -0500)]
egl: Convert int to attrib in eglGetPlatformDisplay
... because converting attrib to int truncates, and that's bad.
Signed-off-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Rob Clark [Fri, 17 Nov 2017 20:18:14 +0000 (15:18 -0500)]
docs: update features for freedreno
Just comparing glxinfo and features.txt, and it seems features.txt is
fairly out of date. The a5xx specific features (compute/images/atomics/
etc) are recent.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Matt Turner [Thu, 16 Nov 2017 19:43:51 +0000 (11:43 -0800)]
i965: Rename intel_asm_annotation -> brw_disasm_info
It was the only file named intel_* in the compiler.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Matt Turner [Thu, 16 Nov 2017 01:08:42 +0000 (17:08 -0800)]
i965: Rewrite disassembly annotation code
The old code used an array to store each "instruction group" (the new,
better name than the old overloaded "annotation"), and required a
memmove() to shift elements over in the array when we needed to split a
group so that we could add an error message. This was confusing and
difficult to get right, not the least of which was because the array
has a tail sentinel not included in .ann_count.
Instead use a linked list, a data structure made for efficient
insertion.
Acked-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Matt Turner [Thu, 16 Nov 2017 21:42:41 +0000 (13:42 -0800)]
i965: Simplify annotation_insert_error()
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Matt Turner [Thu, 16 Nov 2017 21:35:01 +0000 (13:35 -0800)]
i965: Move common code out of #ifdef
I'm going to change the call in a later patch and with the difference in
indentation level it wasn't immediately obvious that the calls were
identical.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Anuj Phogat [Tue, 14 Nov 2017 22:48:21 +0000 (14:48 -0800)]
i965: Remove DWord length from MI_FLUSH_DW definition
Fixes: 6165fda59b8 ("i965: Program DWord Length in MI_FLUSH_DW")
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Sat, 11 Nov 2017 19:52:41 +0000 (11:52 -0800)]
anv/cmd_buffer: Take bo_offset into account in fast clear state addresses
Otherwise, if the image is not bound to the start of the buffer, we're
going to be reading and writing its fast clear state in the wrong spot.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Jason Ekstrand [Sun, 12 Nov 2017 06:03:45 +0000 (22:03 -0800)]
anv/cmd_buffer: Advance the address when initializing clear colors
Found by inspection
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Boyuan Zhang [Tue, 7 Nov 2017 21:25:09 +0000 (16:25 -0500)]
radeon/video: enable encode support for raven
Enable h.264 encode for vcn hardware (raven)
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Boyuan Zhang [Tue, 7 Nov 2017 21:24:10 +0000 (16:24 -0500)]
radeonsi: enable vcn encode
Enable vcn encode by creating radeon_encoder for vcn.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Boyuan Zhang [Wed, 8 Nov 2017 16:24:09 +0000 (11:24 -0500)]
radeon/vcn: add create encoder
Add implementation for create_encoder interface for vcn encode.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Boyuan Zhang [Tue, 7 Nov 2017 21:21:21 +0000 (16:21 -0500)]
radeon/vcn: add encode get feedback
Add implementation for get_feedback interface for vcn encode.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Boyuan Zhang [Tue, 7 Nov 2017 21:20:53 +0000 (16:20 -0500)]
radeon/vcn: add encode destroy
Add implementation for destroy interface for vcn encode.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Boyuan Zhang [Tue, 7 Nov 2017 21:20:25 +0000 (16:20 -0500)]
radeon/vcn: add encode end frame
Add implementation for end_frame interface for vcn encode.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Boyuan Zhang [Tue, 7 Nov 2017 21:20:05 +0000 (16:20 -0500)]
radeon/vcn: add encode bitstream
Add implementation for encode_bitstream interface for vcn encode.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Boyuan Zhang [Tue, 7 Nov 2017 21:19:22 +0000 (16:19 -0500)]
radeon/vcn: add encode begin frame
Add implementation for begin_frame interface for vcn encode.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Boyuan Zhang [Fri, 10 Nov 2017 23:33:32 +0000 (18:33 -0500)]
radeon/vcn: add encode header implementations
Implement encoding of sps, pps, and silce headers using the newly added h.264
header coding descriptors functions based on h.264 specs.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Boyuan Zhang [Tue, 7 Nov 2017 21:17:25 +0000 (16:17 -0500)]
radeon/vcn: add encode header algorithms
Since bitstream headers, e.g. sps, pps, slice, are encoded in driver side, we
need to add corresponding algorithms that required to generate those headers.
According to h.264 specs, signed/unsigned interger Exp-Golomb-coded syntax
element with left bit first (code_se and code_ue) and unsigned integer using
n bits (code_fixed_bits) descriptors function are needed. Therefore, adding
those algorithms and related variables and output algorithms here.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Boyuan Zhang [Tue, 7 Nov 2017 21:09:52 +0000 (16:09 -0500)]
radeon/vcn: add ib implementations
Implement required ibs and command buffer submission interfaces for vcn encode
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Boyuan Zhang [Wed, 8 Nov 2017 16:17:15 +0000 (11:17 -0500)]
radeon/vcn: add common encode part
Add a skeleton pipe video interface and encode ib interface for video encode
on vcn hardware. Add function defines and structures for vcn encode. Update
Makefile.sources and meson.build with newly added files.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Boyuan Zhang [Tue, 7 Nov 2017 20:53:35 +0000 (15:53 -0500)]
st/va: implement poc type
pic_order_cnt_type is a required variable when encoding both sps and
slice header, therefore we need to get this value from st, e.g. vaapi
interface, and then pass it to radeon driver for encoding headers.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>