git.libre-soc.org Git - mesa.git/log

Dylan Baker [Thu, 16 Nov 2017 01:09:33 +0000 (17:09 -0800)]

meson: Remove completed or irrelevant TODO comments

These are all either done already, or are autotools specific. The
misspelled gallium G3DVL is the autotools specific bit; meson
handles that via build_by_default.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>

Dylan Baker [Thu, 16 Nov 2017 01:07:37 +0000 (17:07 -0800)]

meson: Fix TODO for missing dl_iterate_phdr function

This function is required for both the Intel "Anvil" vulkan driver and
the i965 GL driver. Error out if either of those is enabled but this
function isn't found.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>

Dylan Baker [Thu, 16 Nov 2017 00:53:40 +0000 (16:53 -0800)]

meson: disable x86 asm in fewer cases.

This patch allows building asm for x86 on x86_64 platforms, when the
operating system is the same. Previously cross compile always turned off
assembly. This allows using a cross file to cross compile x86 binaries
on x86_64 with asm.

This could probably be relaxed further thanks to meson's "exe_wrapper",
which is a way to specify an emulator or compatibility layer (wine) that
can run the foreign binaries on the build system. Since the meson build
at this point only supports building on Linux I can't test this and I
don't want to write/enable code that cannot even be build tested.

v4: - set condition to build == x86_64 and host == x86 and
build.system == host.system
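
The v4 condition above can be read as a small predicate. The following is only a sketch in C for illustration: the real check lives in the meson build files, and the `struct machine` type and the function name are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical description of a machine, loosely modelled on the
 * information a cross file provides. Not a meson or Mesa type. */
struct machine {
    const char *cpu_family; /* e.g. "x86" or "x86_64" */
    const char *system;     /* e.g. "linux" */
};

/* Returns true if the cross-compile gate lets x86 asm be built.
 * Native builds are not subject to this gate at all. */
static bool cross_check_allows_x86_asm(bool is_cross,
                                       const struct machine *build,
                                       const struct machine *host)
{
    assert(build != NULL && host != NULL);

    if (!is_cross)
        return true;

    /* v4: build == x86_64 and host == x86 and build.system == host.system */
    return strcmp(build->cpu_family, "x86_64") == 0 &&
           strcmp(host->cpu_family, "x86") == 0 &&
           strcmp(build->system, host->system) == 0;
}
```

Under these assumptions, an x86_64 Linux build machine targeting x86 Linux keeps asm enabled, while any other cross combination still turns it off.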

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>

Dylan Baker [Thu, 16 Nov 2017 00:09:22 +0000 (16:09 -0800)]

meson: Enable SSE4.1 optimizations

This patch checks whether the host machine will be x86/x86_64 and, if
so, enables SSE4.1 optimizations.

v2: - Don't compile code, it's unnecessary since we require a compiler
which always has SSE4.1 (Matt)
v3: - x64 -> x86_64 (Matt)

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>

Eric Anholt [Wed, 22 Nov 2017 00:32:33 +0000 (16:32 -0800)]

broadcom/vc5: Fix BASE_LEVEL handling with txl.

The HW doesn't add the base level anywhere (the min/max lod clamping is
what does base level), so we need to add it manually in this case.

Fixes piglit tex-miplevel-selection *Lod 2D.

Eric Anholt [Wed, 22 Nov 2017 00:05:49 +0000 (16:05 -0800)]

broadcom/vc5: Fix array texture layer count setup.

Fixes piglit array-texture.

Eric Anholt [Tue, 21 Nov 2017 23:27:20 +0000 (15:27 -0800)]

broadcom/vc5: Don't increment primitive queries while they're paused.

Fixes ext_transform_feedback-generatemipmap prims_generated

Eric Anholt [Tue, 21 Nov 2017 23:20:31 +0000 (15:20 -0800)]

broadcom/vc5: Fix incorrect padding of TF outputs.

After the first output, we were padding by an extra size of the previous
output. Fixes piglit ext_transform_feedback-output-type mat4x3[2] and
friends.

Eric Anholt [Tue, 21 Nov 2017 23:00:36 +0000 (15:00 -0800)]

broadcom/vc5: Fix UIF surface size setup for ARB_fbo's mismatched sizes.

The HW was computing an implicit height for the surface based on the image
size, but that may be smaller than the surface with ARB_fbo mismatched
sizes. In that case, we need to tell it about the pad, either with the
little 4-bit field in the RT config, or the extended field in
CLEAR_COLORS_PART3.

Fixes piglit arb_framebuffer_object-mixed-buffer-sizes.

Wladimir J. van der Laan [Sat, 18 Nov 2017 09:44:25 +0000 (10:44 +0100)]

etnaviv: Put HALTI level in specs

The HALTI level is an indication of the gross architecture of the GPU.
It largely determines what feature level the GPU has, what
state (especially frontend state) is there, and where it is located.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>

Wladimir J. van der Laan [Sat, 18 Nov 2017 09:44:24 +0000 (10:44 +0100)]

etnaviv: Const-correctness etnaviv_emit.h

The relocation structure is never changed by submitting it.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>

Juan A. Suarez Romero [Tue, 21 Nov 2017 11:38:27 +0000 (12:38 +0100)]

meson: add si_driinfo.h in libgallium_dri

v2: generate target conditionally (Dylan)

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>

Iago Toral Quiroga [Thu, 16 Nov 2017 07:53:07 +0000 (08:53 +0100)]

nir/gather_info: recognize load_patch_vertices_in as a system value

This intrinsic is produced to load SYSTEM_VALUE_VERTICES_IN, which is
generated to load gl_PatchVerticesIn in the SPIR-V path for both
Vulkan and OpenGL.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

Jordan Justen [Wed, 15 Nov 2017 00:27:34 +0000 (16:27 -0800)]

i965: Support decoding INTERFACE_DESCRIPTOR_DATA with INTEL_DEBUG=bat

This will dump the INTERFACE_DESCRIPTOR_DATA along with the associated
samplers & surfaces.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>

Kristian H. Kristensen [Wed, 30 Nov 2016 05:07:57 +0000 (21:07 -0800)]

intel/genxml: Add helpers for determining field type

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

Matt Turner [Mon, 20 Nov 2017 22:21:43 +0000 (14:21 -0800)]

i965/fs: Check ADD/MAD with immediates in satprop unit test

The gen had to be changed from 4 to 6 so that we could test MAD, which
is new on Gen6.

mad_imm_float_neg_mov_sat tests the case fixed by the previous commit.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

Matt Turner [Mon, 20 Nov 2017 22:24:57 +0000 (14:24 -0800)]

i965/fs: Handle negating immediates on MADs when propagating saturates

MADs don't take immediate sources, but we allow them in the IR since it
simplifies a lot of things. I neglected to consider that case.

Fixes: 4009a9ead490 ("i965/fs: Allow saturate propagation to propagate
negations into MADs.")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103616
Reported-and-Tested-by: Ruslan Kabatsayev <b7.10110111@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

Juan A. Suarez Romero [Wed, 15 Nov 2017 16:49:21 +0000 (16:49 +0000)]

mesa/teximage: add TEXTURE_CUBE_MAP_ARRAY target for CompressedTexImage3D

From section 8.7, page 179 of OpenGL ES 3.2 spec:

  An INVALID_OPERATION error is generated by CompressedTexImage3D
  if internalformat is one of the formats in table 8.17 and target
  is not TEXTURE_2D_ARRAY, TEXTURE_CUBE_MAP_ARRAY or TEXTURE_3D.

  An INVALID_OPERATION error is generated by CompressedTexImage3D if
  internalformat is TEXTURE_CUBE_MAP_ARRAY and the “Cube Map Array”
  column of table 8.17 is not checked, or if internalformat is
  TEXTURE_3D and the “3D Tex.” column of table 8.17 is not checked.

So far only TEXTURE_2D_ARRAY was considered a valid target. But since the
"Cube Map Array" column is checked in all cases, in practice we can
also accept TEXTURE_CUBE_MAP_ARRAY.

This fixes KHR-GLES32.core.texture_cube_map_array.etc2_texture

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>

Tapani Pälli [Mon, 20 Nov 2017 08:57:17 +0000 (10:57 +0200)]

intel: fix disasm_info memory leaks

Fixes: 4f82b1728719 ("i965: Rewrite disassembly annotation code")
Cc: Matt Turner <mattst88@gmail.com>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>

Timothy Arceri [Thu, 16 Nov 2017 00:16:10 +0000 (11:16 +1100)]

st/glsl_to_nir: don't generate nir twice for gs

This was left out of c980a3aa3133

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

Roland Scheidegger [Sat, 18 Nov 2017 05:23:35 +0000 (06:23 +0100)]

llvmpipe: fix snorm blending

The blend math gets a bit funky due to inverse blend factors being
in range [0,2] rather than [-1,1], our normalized math can't really
cover this.
src_alpha_saturate blend factor has a similar problem too.
(Note that piglit fbo-blending-formats test is mostly useless for
anything but unorm formats, since not just all src/dst values are
between [0,1], but the tests are crafted in a way that the results
are between [0,1] too.)
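
The range problem described above can be illustrated with plain integers. This is only a sketch on raw 8-bit snorm values (where 127 encodes 1.0), not llvmpipe's actual code; both helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Inverse blend factor (1 - f) computed on raw 8-bit snorm values.
 * The result is returned in a wider type because it can reach 254,
 * i.e. 2.0, which a normalized snorm8 value cannot represent. */
static int inv_factor_snorm8(int8_t f)
{
    /* -128 and -127 both encode -1.0 in snorm8. */
    const int v = (f < -127) ? -127 : f;
    const int r = 127 - v;

    assert(r >= 0 && r <= 254); /* the [0,2] range from the text */
    return r;
}

/* True if a raw value is representable as normalized snorm8. */
static int fits_snorm8(int raw)
{
    return raw >= -127 && raw <= 127;
}
```

With a factor of -1.0 the inverse comes out as 254 (2.0), which is exactly the case the normalized math cannot cover.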

v2: some formatting fixes, and fix a fairly obscure (to debug)
issue with alpha-only formats (not related to snorm at all), where
blend optimization would think it could simplify the blend equation
if the blend factors were complementary, however was using the
completely unrelated rgb blend factors instead of the alpha ones...

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>

Dave Airlie [Fri, 13 May 2016 04:35:33 +0000 (14:35 +1000)]

r600: add cull distance support

This passes all the tests in piglit.

Signed-off-by: Dave Airlie <airlied@redhat.com>

Aravindan Muthukumar [Thu, 9 Nov 2017 05:45:28 +0000 (11:15 +0530)]

i965: Optimize bucket index calculation

Reducing Bucket index calculation to O(1).

This algorithm calculates the index using matrix method.  Assuming
PAGE_SIZE is 4096, matrix arrangement is as below:

          1*4096   2*4096    3*4096    4*4096
          5*4096   6*4096    7*4096    8*4096
          10*4096  12*4096   14*4096   16*4096
          20*4096  24*4096   28*4096   32*4096
           ...      ...       ...       ...
           ...      ...       ...       ...
           ...      ...       ...   max_cache_size

From this matrix it's clearly seen that every row follows the pattern below:

          ...       ...       ...        n
        n+(1/4)n  n+(1/2)n  n+(3/4)n    2n

Row is calculated as log2(size/PAGE_SIZE). Column is calculated by
converting the difference between the elements to fit into a
power-of-two size and indexing it.

Final Index is (row*4)+(col-1)
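
The row/column scheme above can be sketched as follows. This is an illustration under stated assumptions rather than the code from the patch: it relies on the GCC/Clang `__builtin_clz` builtin with a 32-bit `unsigned`, and `SKETCH_PAGE_SIZE`, `bucket_index` and the size limit are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u

/* Maps an allocation size in bytes to a bucket index in O(1), following
 * the matrix layout described above (1,2,3,4 / 5,6,7,8 / 10,12,14,16 ...
 * pages per row). */
static unsigned bucket_index(uint64_t size)
{
    /* Arbitrary limit for this sketch so the shifts below cannot overflow. */
    assert(size > 0 && size <= (UINT64_C(1) << 40));

    /* Round the size up to whole pages. */
    const unsigned pages =
        (unsigned)((size + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE);

    /* Rows end at 4, 8, 16, 32, ... pages. The "| 3" forces everything
     * up to 4 pages into row 0; 30 - clz(x) is floor(log2(x)) - 1. */
    const unsigned row = 30u - (unsigned)__builtin_clz((pages - 1) | 3u);
    const unsigned row_max_pages = 4u << row;

    /* Largest size of the previous row; row 0 has no previous row. */
    const unsigned prev_row_max_pages = (row == 0) ? 0u : row_max_pages / 2;

    /* log2 of the column width in pages: widths are 1, 1, 2, 4, ... */
    const unsigned col_shift = (row <= 1) ? 0u : row - 1;

    /* Round the distance from the previous row's maximum up to a column. */
    const unsigned col =
        (pages - prev_row_max_pages + (1u << col_shift) - 1) >> col_shift;

    return row * 4 + (col - 1);
}
```

For example, a 9-page request lands in row 2, column 1, which is the 10*4096 bucket at index 8.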

Tested with Intel Mesa CI.

Improves performance of 3DMark on BXT by 0.705966% +/- 0.229767% (n=20)

v4: Review comments on style and code comments implemented (Ian).
v3: Review comments implemented (Ian).
v2: Review comments implemented (Jason).

Signed-off-by: Aravindan Muthukumar <aravindan.muthukumar@intel.com>
Signed-off-by: Kedar Karanje <kedar.j.karanje@intel.com>
Reviewed-by: Yogesh Marathe <yogesh.marathe@intel.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>

Dylan Baker [Wed, 15 Nov 2017 01:04:27 +0000 (17:04 -0800)]

meson: Guard the gallium dri component

Currently the target has a redundant guard, and the state tracker isn't
properly guarded.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>

Dylan Baker [Wed, 15 Nov 2017 01:03:39 +0000 (17:03 -0800)]

meson: don't build gallium subdir unless we're building gallium

This will allow us to simplify some guards within the gallium directory.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>

Eric Anholt [Mon, 20 Nov 2017 18:14:38 +0000 (10:14 -0800)]

broadcom/vc5: Align 1D texture miplevels to 64b.

Fixes tex-miplevel-selection GL2:texture() 1D

Eric Anholt [Mon, 20 Nov 2017 18:07:24 +0000 (10:07 -0800)]

broadcom/vc5: Clamp min lod to the last level.

Otherwise, the simulator would complain in tex-miplevel-selection that the
min/max clamp was out of order. The actual HW seems to have clamped to
the max anyway.

Eric Anholt [Mon, 20 Nov 2017 20:26:49 +0000 (12:26 -0800)]

broadcom/vc5: Increase simulator memory for tex-miplevel-selection.

We were overflowing, because of all the little 4k allocations for CLs that
were getting expanded to 128kb in the simulator due to the GMP alignment.

Tim Rowley [Fri, 10 Nov 2017 22:45:38 +0000 (16:45 -0600)]

swr/rast: Repair simd8 frontend code rot

Keep non-default simd8 frontend code running for comparison purposes.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>

Tim Rowley [Thu, 9 Nov 2017 01:17:24 +0000 (19:17 -0600)]

swr/rast: Implement AVX-512 GATHERPS in SIMD16 fetch shader

Disabled for now.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>

Tim Rowley [Wed, 8 Nov 2017 20:07:33 +0000 (14:07 -0600)]

swr/rast: Simplify GATHER* jit builder api

General cleanup, and prep work for possibly moving to llvm masked
gather intrinsic.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>

Tim Rowley [Tue, 7 Nov 2017 21:24:25 +0000 (15:24 -0600)]

swr/rast: Add alignment to transpose targets

Needed to ensure alignment for avx512.

Fixes address sanitizer crash.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>

Tim Rowley [Tue, 7 Nov 2017 19:50:11 +0000 (13:50 -0600)]

swr/rast: Cache eventmanager

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>

Tim Rowley [Tue, 31 Oct 2017 21:46:59 +0000 (16:46 -0500)]

swr/rast: Enable AVX-512 targets in the jitter

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>

Tim Rowley [Tue, 31 Oct 2017 14:41:02 +0000 (09:41 -0500)]

swr/rast: Points with clipdistance can't go through simplepoints path

Fixes piglit glsl-1.20:vs-clip-vertex-primitives and
glsl-1.30:vs-clip-distance-primitives.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>

Tim Rowley [Mon, 23 Oct 2017 20:10:35 +0000 (15:10 -0500)]

swr/rast: Code style change (NFC)

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>

Tim Rowley [Thu, 19 Oct 2017 22:33:37 +0000 (17:33 -0500)]

swr/rast: Widen fetch shader to SIMD16

Widen fetch shader to SIMD16, enable SIMD16 types in the jitter,
and provide EXTRACT/INSERT SIMD8 <-> SIMD16 utility functions.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>

Tim Rowley [Wed, 18 Oct 2017 21:51:07 +0000 (16:51 -0500)]

swr/rast: Support flexible vertex layout for DS output

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>

Nicolai Hähnle [Fri, 10 Nov 2017 10:15:44 +0000 (11:15 +0100)]

gallium/u_threaded: avoid syncing in threaded_context_flush

We could always do the flush asynchronously, but if we're going to wait
for a fence anyway and the driver thread is currently idle, the additional
communication overhead isn't worth it.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

Nicolai Hähnle [Fri, 10 Nov 2017 09:58:10 +0000 (10:58 +0100)]

radeonsi: avoid syncing the driver thread in si_fence_finish

It is really only required when we need to flush for deferred fences.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

Nicolai Hähnle [Mon, 13 Nov 2017 13:50:17 +0000 (14:50 +0100)]

radeonsi: recompute the relative timeout after waiting for ready fence

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

Nicolai Hähnle [Fri, 10 Nov 2017 16:13:27 +0000 (17:13 +0100)]

ddebug: fix the hang detection timeout calculation

Fixes: c9fefa062b36 ("ddebug: rewrite to always use a threaded approach")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

Nicolai Hähnle [Fri, 10 Nov 2017 12:11:53 +0000 (13:11 +0100)]

ddebug: fix use-after-free of streamout targets

Fixes: b47727a83ad6 ("ddebug: implement pipelined hang detection mode")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

Nicolai Hähnle [Fri, 10 Nov 2017 10:28:28 +0000 (11:28 +0100)]

gallium/u_threaded: properly initialize fence unflushed tokens

This got lost in a rebase but never hurt anything because we happened
to always sync in fence_finish anyway...

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

Nicolai Hähnle [Fri, 10 Nov 2017 11:32:44 +0000 (12:32 +0100)]

util/u_queue: really use futex-based fences

The relevant define changed in the final revision of the simple mutex
patch.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

Nicolai Hähnle [Mon, 13 Nov 2017 13:35:50 +0000 (14:35 +0100)]

util/u_queue: fix timeout handling in util_queue_fence_wait_timeout

Fixes: e3a8013de8ca ("util/u_queue: add util_queue_fence_wait_timeout")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

Nicolai Hähnle [Thu, 9 Nov 2017 13:34:20 +0000 (14:34 +0100)]

st/mesa: use asynchronous flushes in st_finish

With threaded gallium, the driver may currently be running in another
thread. In that case, we will execute all remaining commands in that
thread instead of syncing, which should be better for cache locality.

Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

Nicolai Hähnle [Thu, 9 Nov 2017 13:34:19 +0000 (14:34 +0100)]

st/mesa: implement st_server_wait_sync properly

Asynchronous flushes require a proper implementation of
st_server_wait_sync, because we could have the following with
threaded Gallium:

Context 1 app     Context 1 driver         Context 2
-------------     ----------------         ---------
f = glFenceSync
glFlush
<-- app sync -->                           <-- app sync -->
                                            glWaitSync(f)
                                            .. draw calls ..
                   pipe_context::flush
                     for glFenceSync
                   pipe_context::flush
                     for glFlush

Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

Nicolai Hähnle [Mon, 6 Nov 2017 10:56:54 +0000 (11:56 +0100)]

u_threaded_gallium: remove synchronization in fence_server_sync

The whole point of fence_server_sync is that it can be used to
avoid waiting in the application thread.

Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

Nicolai Hähnle [Wed, 15 Nov 2017 11:51:23 +0000 (12:51 +0100)]

amd: build addrlib with C++11

It is required for LLVM anyway.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103658
Fixes: 7f33e94e43a6 ("amd/addrlib: update to latest version")
Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

Nicolai Hähnle [Wed, 15 Nov 2017 10:22:26 +0000 (11:22 +0100)]

radeonsi/gfx9: fix VM fault with fetched instance divisors

We need to account for SGPR locations in merged shaders.

This case is exercised by KHR-GL45.enhanced_layouts.vertex_attrib_locations

Fixes: 79c2e7388c7f ("radeonsi/gfx9: use SPI_SHADER_USER_DATA_COMMON")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

Samuel Pitoiset [Wed, 15 Nov 2017 11:08:29 +0000 (12:08 +0100)]

radv: use a 16 bytes array for the sampled/storage image descriptors

This allows updating them with a single memcpy().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>

Samuel Pitoiset [Wed, 15 Nov 2017 09:55:05 +0000 (10:55 +0100)]

radv: do not add the query pool BO to the list in vkCmdEndQuery()

As per the spec, the query identified by queryPool and query
must currently be active. Applications have to call vkCmdBeginQuery()
before, and thus the query pool BO will already be in the list.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>

Samuel Pitoiset [Wed, 15 Nov 2017 14:44:01 +0000 (15:44 +0100)]

radv: only load needed depth clear regs for fast depth clears

Similar to how the driver sets the depth clear regs after a
fast depth clear. Most of the time, this will copy a 32-bit reg
instead of a 64-bit reg.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Samuel Pitoiset [Wed, 15 Nov 2017 14:44:00 +0000 (15:44 +0100)]

radv: do not add the image BO in radv_set_depth_clear_regs()

For the fast path, radv_fill_buffer() ensures that the BO is
already in the list. For the slow path, the depth surface is
part of the framebuffer which means the BO is added to the list
when the framebuffer is emitted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Samuel Pitoiset [Wed, 15 Nov 2017 14:43:59 +0000 (15:43 +0100)]

radv: remove useless assertion in emit_depthstencil_clear()

Already checked in emit_clear().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Samuel Pitoiset [Wed, 15 Nov 2017 14:43:58 +0000 (15:43 +0100)]

radv: remove useless check in radv_set_depth_clear_regs()

aspects can't be zero and there is an assertion that ensures
it's not in emit_clear().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Dave Airlie [Sun, 19 Nov 2017 23:19:31 +0000 (09:19 +1000)]

docs/features: mark some r600 extensions supported

These just looked to be missed when this file was updated.

Signed-off-by: Dave Airlie <airlied@redhat.com>

George Barrett [Sun, 19 Nov 2017 10:55:10 +0000 (21:55 +1100)]

glsl: Catch subscripted calls to undeclared subroutines

generate_array_index fails to check whether the target of a subroutine
call exists in the AST, potentially passing around null ir_rvalue
pointers eventuating in abort/segfault.

Fixes: fd01840c0bd3 ("glsl: add AoA support to subroutines")
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100438

Eric Anholt [Fri, 13 Oct 2017 20:11:15 +0000 (13:11 -0700)]

broadcom/vc5: Fix up integer texture handling.

The original spec I had didn't expose integer textures and suggested that
you use unfiltered floats. Now there are proper formats for them.

Fixes 16- and 32-bit texwrap integer tests in piglit, and
dEQP-GLES3.functional.fbo.completeness.renderable.renderbuffer.color0.rgb10_a2ui.

Eric Anholt [Fri, 17 Nov 2017 01:50:55 +0000 (17:50 -0800)]

broadcom/vc5: Fix simulator assertion failures about color RT clears.

When we tried to clear color while storing depth, the simulator hit an
assertion failure because it basically didn't have enough information to
decide which color RT to clear. It turns out that STORE_GENERAL picks the
buffer according to the color buffer being stored, or all of them if NONE.
If you're doing depth, it doesn't know which to pick.

Rob Clark [Sat, 18 Nov 2017 15:40:49 +0000 (10:40 -0500)]

freedreno/ir3: add texture gather support

Signed-off-by: Rob Clark <robdclark@gmail.com>

Lucas Stach [Wed, 15 Nov 2017 16:33:17 +0000 (17:33 +0100)]

etnaviv: enable full overwrite when no color buffer is present

The OVERWRITE bit disables destination fetches, which is exactly what
we want when there is no valid color buffer bound.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>

Jason Ekstrand [Sat, 18 Nov 2017 01:27:55 +0000 (17:27 -0800)]

i965: Stop including brw_cfg.h in brw_disasm_info.h

The brw_disasm_info header is included by certain tools in order to get
shader assembly from binaries so it's a semi-external header.  Including
brw_cfg.h also pulls in brw_shader.h so you end up getting quite a bit
of our back-end compiler internals.  Instead, make the couple of forward
declarations we need and make the header more stand-alone.  This fixes
the meson build.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Fixes: 4f82b17287194ca7d10816f6cfe4712a3e0a03fc

Jason Ekstrand [Sat, 18 Nov 2017 00:52:09 +0000 (16:52 -0800)]

i965: Mark BOs as external when we export their handle

Almost all of our BO export paths already properly marked the BO as
external and added it to the handle table.  Most export use-cases go
through a prime fd or flink where we have a brw_bo export helper that
does the right thing.  The one missing one happens when you call
queryImage and ask for __DRI_IMAGE_ATTRIB_HANDLE.  We just grabbed the
gem handle out of the BO (because it's really easy to do that) and
handed it off to the client; what could go wrong?  As it turns out, this
path is used by basically every compositor that wants to turn around and
call drmModeAddFB2 on it so it can hand it off to display.  The result,
as of 4b1e70cc57d7ff5f465544644b2180dee1490cee, is that we no longer set
MOCS_PTE on those surfaces and the kernel's attempts to disable caching
fail, so scanout gets corrupted.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103759
Fixes: 4b1e70cc57d7ff5f465544644b2180dee1490cee
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org

Jason Ekstrand [Sat, 18 Nov 2017 00:49:03 +0000 (16:49 -0800)]

i965/bufmgr: Add a helper to mark a BO as external

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org

Andres Gomez [Sat, 18 Nov 2017 00:48:45 +0000 (02:48 +0200)]

i965: Correct disasm_info usage in eu_validate test

Fixes: 4f82b1728719 ("i965: Rewrite disassembly annotation code")
Cc: Matt Turner <mattst88@gmail.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>

Eric Anholt [Tue, 14 Nov 2017 23:52:53 +0000 (15:52 -0800)]

broadcom/vc5: Set up the padded height at surface creation time.

This centralizes the calculation in the surface, instead of in each
load/store.

Eric Anholt [Wed, 15 Nov 2017 23:05:37 +0000 (15:05 -0800)]

broadcom/vc5: Ensure that there is always a TLB write.

This should fix some GPU hangs in our (currently always single-threaded)
fragment shaders, and definitely fixes assertion failures in simulation.

Eric Anholt [Tue, 7 Nov 2017 23:42:04 +0000 (15:42 -0800)]

broadcom/vc5: Fix clear color for swap_color_rb render targets.

Fixes dEQP-GLES3.functional.depth_stencil_clear.depth.*

Eric Anholt [Tue, 7 Nov 2017 23:37:46 +0000 (15:37 -0800)]

broadcom/vc5: Fix pasteo in front stencil ref value setup.

Fixes piglit masked-clear.

Eric Anholt [Tue, 7 Nov 2017 23:35:33 +0000 (15:35 -0800)]

broadcom/vc5: Fix colormasking when we need to swap r/b colors.

Fixes part of piglit masked-clear.

Eric Anholt [Tue, 7 Nov 2017 23:21:06 +0000 (15:21 -0800)]

broadcom/vc5: Enable the Z min/max clipping planes.

Eric Anholt [Wed, 15 Nov 2017 00:01:32 +0000 (16:01 -0800)]

broadcom/vc5: Fix driver for new PIPE_SHADER_CAP_MAX_HW_ATOMIC_*.

Brian Paul [Fri, 17 Nov 2017 16:38:39 +0000 (09:38 -0700)]

r300: add PIPE_SHADER_CAP_MAX_HW_ATOMIC_COUNTER* switch cases

To silence compiler warnings.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

Brian Paul [Fri, 17 Nov 2017 22:03:21 +0000 (15:03 -0700)]

tgsi: s/uint/enum pipe_shader_type/

Roland Scheidegger <sroland@vmware.com>

Brian Paul [Fri, 17 Nov 2017 16:51:10 +0000 (09:51 -0700)]

tgsi: bump tgsi_opcode_info::output_mode size to 4 bits

To avoid problems with MSVC. And verify size with ASSERT_BITFIELD_SIZE().

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>

Kenneth Graunke [Fri, 17 Nov 2017 06:31:27 +0000 (22:31 -0800)]

i965: Revert Gen8 aspect of VF PIPE_CONTROL workaround.

This apparently causes hangs on Broadwell, so let's back it out for now.
I think there are other PIPE_CONTROL workarounds that we're missing.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103787

Adam Jackson [Thu, 16 Nov 2017 18:27:27 +0000 (13:27 -0500)]

egl: Convert int to attrib in eglGetPlatformDisplay

... because converting attrib to int truncates, and that's bad.
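
The direction of the conversion matters because EGLAttrib is an `intptr_t`, which is wider than `int` on common 64-bit platforms. Below is a minimal sketch of widening an `int` list instead of narrowing an attrib list; the helper name and `SKETCH_EGL_NONE` macro are hypothetical and this is not the code from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_EGL_NONE 0x3038 /* value of EGL_NONE in the EGL headers */

/* Widens an EGL_NONE-terminated list of (name, value) int pairs into
 * intptr_t, the type behind EGLAttrib. Widening never loses bits.
 * Returns the number of entries written including the terminator,
 * or 0 if `out` is too small. A NULL input yields an empty list. */
static size_t widen_attribs(const int *in, intptr_t *out, size_t out_len)
{
    size_t n = 0;

    assert(out != NULL);

    if (in != NULL) {
        while (in[n] != SKETCH_EGL_NONE) {
            /* Need room for this pair plus the terminator. */
            if (n + 3 > out_len)
                return 0;
            out[n] = in[n];
            out[n + 1] = in[n + 1];
            n += 2;
        }
    }

    if (n + 1 > out_len)
        return 0;
    out[n] = SKETCH_EGL_NONE;
    return n + 1;
}
```

Going the other way, from `intptr_t` to `int`, could drop the upper bits of a pointer-sized value, which is the truncation the commit message refers to.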

Signed-off-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>

Rob Clark [Fri, 17 Nov 2017 20:18:14 +0000 (15:18 -0500)]

docs: update features for freedreno

Just comparing glxinfo and features.txt, and it seems features.txt is
fairly out of date. The a5xx specific features (compute/images/atomics/
etc) are recent.

Signed-off-by: Rob Clark <robdclark@gmail.com>

Matt Turner [Thu, 16 Nov 2017 19:43:51 +0000 (11:43 -0800)]

i965: Rename intel_asm_annotation -> brw_disasm_info

It was the only file named intel_* in the compiler.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

Matt Turner [Thu, 16 Nov 2017 01:08:42 +0000 (17:08 -0800)]

i965: Rewrite disassembly annotation code

The old code used an array to store each "instruction group" (the new,
better name than the old overloaded "annotation"), and required a
memmove() to shift elements over in the array when we needed to split a
group so that we could add an error message. This was confusing and
difficult to get right, not least because the array
has a tail sentinel not included in .ann_count.

Instead use a linked list, a data structure made for efficient
insertion.
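
To show why a list makes the split cheap, here is a minimal sketch with a hand-rolled singly linked list. The `struct inst_group` type and `split_group` function are hypothetical stand-ins, not the structures used in the driver.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an "instruction group". */
struct inst_group {
    int offset;              /* start offset of the group in the program */
    const char *error;       /* optional message attached to the group */
    struct inst_group *next;
};

/* Splits `group` at `offset` by linking a new node right after it.
 * No other element moves, unlike the array + memmove() scheme.
 * Returns the new node, or NULL if allocation fails. */
static struct inst_group *split_group(struct inst_group *group, int offset)
{
    assert(group != NULL && offset > group->offset);

    struct inst_group *tail = calloc(1, sizeof(*tail));
    if (tail == NULL)
        return NULL;

    tail->offset = offset;
    tail->next = group->next;
    group->next = tail;
    return tail;
}
```

The split touches only two pointers regardless of how many groups follow, and there is no sentinel element to keep in sync with a separate count.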

Acked-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

Matt Turner [Thu, 16 Nov 2017 21:42:41 +0000 (13:42 -0800)]

i965: Simplify annotation_insert_error()

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

Matt Turner [Thu, 16 Nov 2017 21:35:01 +0000 (13:35 -0800)]

i965: Move common code out of #ifdef

I'm going to change the call in a later patch and with the difference in
indentation level it wasn't immediately obvious that the calls were
identical.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

Anuj Phogat [Tue, 14 Nov 2017 22:48:21 +0000 (14:48 -0800)]

i965: Remove DWord length from MI_FLUSH_DW definition

Fixes: 6165fda59b8 ("i965: Program DWord Length in MI_FLUSH_DW")
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

Jason Ekstrand [Sat, 11 Nov 2017 19:52:41 +0000 (11:52 -0800)]

anv/cmd_buffer: Take bo_offset into account in fast clear state addresses

Otherwise, if the image is not bound to the start of the buffer, we're
going to be reading and writing its fast clear state in the wrong spot.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable@lists.freedesktop.org

Jason Ekstrand [Sun, 12 Nov 2017 06:03:45 +0000 (22:03 -0800)]

anv/cmd_buffer: Advance the address when initializing clear colors

Found by inspection

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: mesa-stable@lists.freedesktop.org

Boyuan Zhang [Tue, 7 Nov 2017 21:25:09 +0000 (16:25 -0500)]

radeon/video: enable encode support for raven

Enable h.264 encode for vcn hardware (raven)

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>

Boyuan Zhang [Tue, 7 Nov 2017 21:24:10 +0000 (16:24 -0500)]

radeonsi: enable vcn encode

Enable vcn encode by creating radeon_encoder for vcn.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>

Boyuan Zhang [Wed, 8 Nov 2017 16:24:09 +0000 (11:24 -0500)]

radeon/vcn: add create encoder

Add implementation for create_encoder interface for vcn encode.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>

Boyuan Zhang [Tue, 7 Nov 2017 21:21:21 +0000 (16:21 -0500)]

radeon/vcn: add encode get feedback

Add implementation for get_feedback interface for vcn encode.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>

Boyuan Zhang [Tue, 7 Nov 2017 21:20:53 +0000 (16:20 -0500)]

radeon/vcn: add encode destroy

Add implementation for destroy interface for vcn encode.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>

Boyuan Zhang [Tue, 7 Nov 2017 21:20:25 +0000 (16:20 -0500)]

radeon/vcn: add encode end frame

Add implementation for end_frame interface for vcn encode.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>

Boyuan Zhang [Tue, 7 Nov 2017 21:20:05 +0000 (16:20 -0500)]

radeon/vcn: add encode bitstream

Add implementation for encode_bitstream interface for vcn encode.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>

Boyuan Zhang [Tue, 7 Nov 2017 21:19:22 +0000 (16:19 -0500)]

radeon/vcn: add encode begin frame

Add implementation for begin_frame interface for vcn encode.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>

Boyuan Zhang [Fri, 10 Nov 2017 23:33:32 +0000 (18:33 -0500)]

radeon/vcn: add encode header implementations

Implement encoding of sps, pps, and slice headers using the newly added h.264
header coding descriptor functions, following the h.264 spec.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>

Boyuan Zhang [Tue, 7 Nov 2017 21:17:25 +0000 (16:17 -0500)]

radeon/vcn: add encode header algorithms

Since bitstream headers (e.g. sps, pps, slice) are encoded on the driver
side, we need to add the algorithms required to generate those headers.
According to the h.264 spec, the descriptor functions needed are
signed/unsigned integer Exp-Golomb-coded syntax elements with the left bit
first (code_se and code_ue) and unsigned integers using n bits
(code_fixed_bits). Therefore, add those algorithms, along with the related
variables and output routines, here.
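
The three descriptors can be sketched as below. This is a minimal
illustration, not the driver's implementation: only the names code_ue,
code_se, and code_fixed_bits come from the commit message, while the
bit_writer struct and the function bodies are assumptions. The sketch also
leaves out emulation prevention bytes, which a real H.264 bitstream writer
has to insert.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit writer: appends bits MSB-first ("left bit first") into
 * a byte buffer that must start out zeroed. */
struct bit_writer {
   uint8_t buf[64];
   unsigned bit_count;
};

/* u(n): write the low `num_bits` bits of `value`, most significant first. */
static void
code_fixed_bits(struct bit_writer *w, uint32_t value, unsigned num_bits)
{
   assert(num_bits <= 32);
   assert(w->bit_count + num_bits <= sizeof(w->buf) * 8);

   for (unsigned i = num_bits; i-- > 0;) {
      if ((value >> i) & 1)
         w->buf[w->bit_count / 8] |= 0x80 >> (w->bit_count % 8);
      w->bit_count++;
   }
}

/* ue(v): unsigned Exp-Golomb. value + 1 is written in binary, preceded by
 * one zero bit for every bit after its leading one. */
static void
code_ue(struct bit_writer *w, uint32_t value)
{
   assert(value < UINT32_MAX);

   uint32_t code = value + 1;
   unsigned len = 0;
   for (uint32_t tmp = code; tmp != 0; tmp >>= 1)
      len++;

   code_fixed_bits(w, 0, len - 1); /* zero prefix */
   code_fixed_bits(w, code, len);  /* leading one plus info bits */
}

/* se(v): signed Exp-Golomb. Positive k maps to 2k - 1, and zero or
 * negative k maps to -2k, then the result is coded as ue(v). */
static void
code_se(struct bit_writer *w, int32_t value)
{
   assert(value > INT32_MIN);

   uint32_t mapped = value > 0 ? 2u * (uint32_t)value - 1u
                               : 2u * (uint32_t)(-value);
   code_ue(w, mapped);
}
```

For example, writing ue(0), ue(3), se(-1), and the 3-bit value 5 in that
order produces the bit string 1 00100 011 101, i.e. the bytes 0x91 0xD0
once the last byte is padded with zeros.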

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>

Boyuan Zhang [Tue, 7 Nov 2017 21:09:52 +0000 (16:09 -0500)]

radeon/vcn: add ib implementations

Implement required ibs and command buffer submission interfaces for vcn encode

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>

Boyuan Zhang [Wed, 8 Nov 2017 16:17:15 +0000 (11:17 -0500)]

radeon/vcn: add common encode part

Add a skeleton pipe video interface and encode ib interface for video encode
on vcn hardware. Add function defines and structures for vcn encode. Update
Makefile.sources and meson.build with newly added files.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>

Boyuan Zhang [Tue, 7 Nov 2017 20:53:35 +0000 (15:53 -0500)]

st/va: implement poc type

pic_order_cnt_type is required when encoding both the sps and the slice
header, so we need to get this value from the state tracker (e.g. the vaapi
interface) and pass it to the radeon driver for encoding headers.
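
As a rough illustration of why the header encoder needs this value, the
slice header syntax in the H.264 spec branches on it. The helper below is
hypothetical and only names the first picture-order-count element that
follows for each type; it is not part of the driver or of VAAPI.

```c
#include <assert.h>
#include <stddef.h>

/* First POC-related syntax element written in the slice header for a given
 * pic_order_cnt_type, or NULL when none is written (type 2). */
static const char *
slice_header_poc_element(unsigned pic_order_cnt_type)
{
   assert(pic_order_cnt_type <= 2);

   switch (pic_order_cnt_type) {
   case 0:
      return "pic_order_cnt_lsb";
   case 1:
      /* Written only when delta_pic_order_always_zero_flag is 0. */
      return "delta_pic_order_cnt[0]";
   default:
      return NULL;
   }
}
```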

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
