git.libre-soc.org Git - mesa.git/log

Samuel Pitoiset [Mon, 18 Dec 2017 18:38:58 +0000 (19:38 +0100)]

radv: do not add extra SGPR when push constants are not used

Just because the vertex stage needs some push constants does not mean
that other stages need them too. This should reduce the number
of loaded SGPRs in some situations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Samuel Pitoiset [Mon, 18 Dec 2017 18:38:57 +0000 (19:38 +0100)]

radv: change the needs_push_constants logic

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Samuel Pitoiset [Mon, 18 Dec 2017 18:38:56 +0000 (19:38 +0100)]

radv: store pipeline stages that need push constants

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Samuel Pitoiset [Mon, 18 Dec 2017 18:38:55 +0000 (19:38 +0100)]

radv: remove one useless check in ac_nir_shader_info_pass()

pipeline->layout can't be NULL now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Samuel Pitoiset [Mon, 18 Dec 2017 18:38:54 +0000 (19:38 +0100)]

radv: remove one useless check in radv_flush_constants()

pipeline->layout can't be NULL now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Samuel Pitoiset [Mon, 18 Dec 2017 18:38:53 +0000 (19:38 +0100)]

radv: add assertions to make sure pipeline layout objects are valid

The spec requires it.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Samuel Pitoiset [Mon, 18 Dec 2017 18:38:52 +0000 (19:38 +0100)]

radv: create pipeline layout objects for all meta operations

They are dummy objects, but the spec requires layout to not be
NULL; this just makes sure we are creating valid pipeline layout
objects. This will allow us to remove some useless checks.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Bas Nieuwenhuizen [Tue, 19 Dec 2017 08:01:32 +0000 (09:01 +0100)]

radv: Use a sort for rebuilding the sparse buffer bo list.

It uses slightly more memory (though still bounded by the number
of mapped ranges), but gives less quadratic behavior.

Cuts 4 minutes from the runtime of the CTS *.sparse.* tests.

Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
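
As an illustration of the idea only (not the code from this commit), a
duplicate-free list can be rebuilt by sorting the handles and then dropping
repeats in one linear pass, instead of searching the output list for every
input. The type bo_handle and the function build_unique_bo_list() are
hypothetical names, and this sketch sorts in place rather than using the
extra memory the commit message mentions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a buffer object handle; not radv's real type. */
typedef uintptr_t bo_handle;

/* qsort comparator that orders handles by value without overflow. */
static int compare_handles(const void *a, const void *b)
{
    bo_handle x = *(const bo_handle *)a;
    bo_handle y = *(const bo_handle *)b;
    return (x > y) - (x < y);
}

/* Sorts the handles in place, drops duplicates, and returns how many
 * unique handles remain at the front of the array. */
static size_t build_unique_bo_list(bo_handle *handles, size_t count)
{
    assert(handles != NULL || count == 0);
    if (count == 0)
        return 0;

    qsort(handles, count, sizeof(handles[0]), compare_handles);

    size_t unique = 1;
    for (size_t i = 1; i < count; i++) {
        /* After sorting, duplicates are adjacent, so one comparison
         * against the last kept handle is enough. */
        if (handles[i] != handles[unique - 1])
            handles[unique++] = handles[i];
    }
    return unique;
}
```

The sort dominates the cost, so the whole rebuild is O(n log n) in the number
of mapped ranges rather than quadratic.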

Rob Clark [Mon, 18 Dec 2017 20:09:49 +0000 (15:09 -0500)]

freedreno/ir3: handle VTXID_BASE for indirect draws

Need to do some gymnastics to copy the parameter from the indirect
parameters buffer to uniform so shader sees the correct base-vertex-id.

Fixes ./bin/arb_draw_indirect-vertexid on a5xx and probably a4xx too.

Signed-off-by: Rob Clark <robdclark@gmail.com>

Rob Clark [Mon, 18 Dec 2017 20:06:37 +0000 (15:06 -0500)]

freedreno/ir3: add ctx->mem_to_mem()

For dealing with indirect-draw + gl_VertexID, we'll introduce another
case where we need to use CP_MEM_TO_MEM. Rather than adding more
if(a5xx)/else make this a ctx vfunc.

Signed-off-by: Rob Clark <robdclark@gmail.com>

Rob Clark [Mon, 18 Dec 2017 18:34:18 +0000 (13:34 -0500)]

freedreno/a5xx: use vertex_id_zero_base

Cmdstream traces from blob make it clear that the blob driver devs
*think* a5xx has a real (non-zero-based) vtxid. But reality claims
differently.

Fixes ./bin/gl-3.2-basevertex-vertexid and probably others.

This means draw-indirect is going to need some gymnastics to copy
base-vertex into uniform. (a4xx probably needs that too.)

Signed-off-by: Rob Clark <robdclark@gmail.com>

Dave Airlie [Tue, 19 Dec 2017 05:36:53 +0000 (05:36 +0000)]

r600: clear compressed flags in image state on unbind.

If we aren't binding an image, clear the compressed flags.

This fixes a segfault seen with an apitrace.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104331
Signed-off-by: Dave Airlie <airlied@redhat.com>

George Kyriazis [Thu, 14 Dec 2017 18:01:53 +0000 (12:01 -0600)]

swr: Account for index_bias in offsets

When calculating buffer offsets for client buffers account for info.index_bias.

Fixes the following piglit tests:
arb_draw_elements_base_vertex-drawelements-user_varrays
arb_draw_elements_base_vertex-negative-index-user_varrays

Reviewed-By: Bruce Cherniak <bruce.cherniak@intel.com>

Dave Airlie [Mon, 18 Dec 2017 21:38:09 +0000 (21:38 +0000)]

r600: only report tgsi ir compute support on evergreen+

This fixes a crash on r600/r700.

Signed-off-by: Dave Airlie <airlied@redhat.com>

Bas Nieuwenhuizen [Mon, 18 Dec 2017 20:09:19 +0000 (21:09 +0100)]

radv: Advertise sync fd import and export.

Passes dEQP-VK.*.sync_fd.*

Reviewed-by: Dave Airlie <airlied@redhat.com>

Bas Nieuwenhuizen [Mon, 18 Dec 2017 20:02:05 +0000 (21:02 +0100)]

radv: Implement sync file import/export for fences & semaphores.

Reviewed-by: Dave Airlie <airlied@redhat.com>

Bas Nieuwenhuizen [Mon, 18 Dec 2017 19:33:07 +0000 (20:33 +0100)]

radv/amdgpu: wrap sync fd import/export.

Reviewed-by: Dave Airlie <airlied@redhat.com>

Dave Airlie [Mon, 18 Dec 2017 06:53:44 +0000 (16:53 +1000)]

ac/nir: fix lds store for patch outputs.

This wasn't calculating the correct value. This, along with
a nir patch, fixes a regression in:
dEQP-VK.tessellation.shader_input_output.barrier

Fixes: 043d14db30a (ac/nir: don't write tcs outputs to LDS that aren't read back.)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>

Dave Airlie [Mon, 18 Dec 2017 06:49:43 +0000 (16:49 +1000)]

nir/linking: always set the used_across_stages/outputs_read bits

If we don't remap an output, this code would trample the outputs
read bits.

This fixes a regression in
dEQP-VK.tessellation.shader_input_output.barrier

Fixes: 1c9c42d16b4c (nir: add varying component packing helpers)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>

Jason Ekstrand [Fri, 15 Dec 2017 03:53:05 +0000 (19:53 -0800)]

spirv: Relax the validation conditions of OpSelect

The Talos Principle contains shaders with an OpSelect between two
vectors where the condition is a scalar boolean. This is technically
against the spec, but nir_builder gracefully handles it by splatting
out the condition to all the channels. So long as the condition is a
boolean, just emit a warning instead of failing.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104246
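
To make "splatting out the condition" concrete, here is a minimal sketch in
plain C rather than nir_builder code. The helpers select_vec() and
select_scalar_cond() and the MAX_CHANNELS limit are hypothetical names for
illustration; they only model a per-channel select whose scalar condition is
replicated across all channels first.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_CHANNELS 4 /* assumed vector width limit for this sketch */

/* Per-channel select: out[i] = cond[i] ? a[i] : b[i]. */
static void select_vec(const bool *cond, const float *a, const float *b,
                       float *out, unsigned n)
{
    for (unsigned i = 0; i < n; i++)
        out[i] = cond[i] ? a[i] : b[i];
}

/* Select between two vectors using a single scalar condition by first
 * splatting (replicating) the condition to every channel. */
static void select_scalar_cond(bool cond, const float *a, const float *b,
                               float *out, unsigned n)
{
    assert(n <= MAX_CHANNELS);

    bool splat[MAX_CHANNELS];
    for (unsigned i = 0; i < n; i++)
        splat[i] = cond;

    select_vec(splat, a, b, out, n);
}
```

With a true condition the result is a copy of the first vector, and with a
false condition a copy of the second, which matches what a shader author
writing a scalar condition presumably intended.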

Samuel Pitoiset [Fri, 15 Dec 2017 17:54:00 +0000 (18:54 +0100)]

radv: remove useless radv_cmask_info::base_address_reg

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Samuel Pitoiset [Fri, 15 Dec 2017 14:37:19 +0000 (15:37 +0100)]

amd/common: add ac_vgt_gs_mode() helper

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Samuel Pitoiset [Fri, 15 Dec 2017 14:37:18 +0000 (15:37 +0100)]

amd/common: add ac_get_cb_shader_mask() helper

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Samuel Pitoiset [Fri, 15 Dec 2017 15:01:56 +0000 (16:01 +0100)]

Revert "radv: do not load unused gl_LocalInvocationID/gl_WorkGroupID components"

This reverts commit 2294d35b243dee15af15895e876a63b7d22e48cc.

We can't do this without adjusting the input SGPRs/VGPRs logic.
For now, just revert it. I will send a proper solution later.

It fixes a rendering issue in F1 2017 that CTS didn't catch.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Dave Airlie [Mon, 18 Dec 2017 05:05:52 +0000 (15:05 +1000)]

radv: port merge tess info from anv

anv merges the tess info correctly, but radv wasn't doing this.

This fixes hangs in
dEQP-VK.tessellation.winding.default_domain.hlsl_triangles_ccw

Fixes: 60fc0544e0 (radv/pipeline: handle tessellation shader compilation)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>

Bas Nieuwenhuizen [Mon, 27 Nov 2017 23:28:14 +0000 (00:28 +0100)]

radv: Add external fence support.

Reviewed-by: Dave Airlie <airlied@redhat.com>

Bas Nieuwenhuizen [Mon, 27 Nov 2017 23:21:12 +0000 (00:21 +0100)]

radv: Implement VK_KHR_external_fence_fd.

Reviewed-by: Dave Airlie <airlied@redhat.com>

Bas Nieuwenhuizen [Mon, 27 Nov 2017 22:58:35 +0000 (23:58 +0100)]

radv: Implement fences based on syncobjs.

Reviewed-by: Dave Airlie <airlied@redhat.com>

Bas Nieuwenhuizen [Mon, 27 Nov 2017 00:06:11 +0000 (01:06 +0100)]

amd/common: Add detection of the syncobj wait/signal/reset ioctls.

First amdgpu bump after inclusion was 20 (which was done for local BOs).

Reviewed-by: Dave Airlie <airlied@redhat.com>

Bas Nieuwenhuizen [Mon, 27 Nov 2017 00:02:42 +0000 (01:02 +0100)]

radv: Add syncobj signal/reset/wait to winsys.

Reviewed-by: Dave Airlie <airlied@redhat.com>

Bas Nieuwenhuizen [Sat, 16 Dec 2017 23:51:58 +0000 (00:51 +0100)]

configure/meson: Bump libdrm_amdgpu version requirement.

For the radv dependencies on syncobj signal/reset.

Reviewed-by: Dave Airlie <airlied@redhat.com>

Tapani Pälli [Tue, 12 Dec 2017 08:01:57 +0000 (10:01 +0200)]

android: fix vulkan driver build

fixes undefined references by adding missing wsi common API

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>

Tapani Pälli [Tue, 12 Dec 2017 08:01:56 +0000 (10:01 +0200)]

android: fix undefined references to futex API

Fixes: f98a2768ca "mesa: Add new fast mtx_t mutex type for basic use cases"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>

Dave Airlie [Mon, 18 Dec 2017 04:28:07 +0000 (04:28 +0000)]

docs: mark GL4.3 as finished for r600

Still only on fp64 supported hw.

Dave Airlie [Fri, 3 Nov 2017 01:53:36 +0000 (11:53 +1000)]

r600: export robust buffer access

Signed-off-by: Dave Airlie <airlied@redhat.com>

Dave Airlie [Fri, 3 Nov 2017 01:52:26 +0000 (11:52 +1000)]

r600: export GLSL 430

Signed-off-by: Dave Airlie <airlied@redhat.com>

Dave Airlie [Fri, 3 Nov 2017 01:30:12 +0000 (11:30 +1000)]

r600/cs: add compute support to caps

Signed-off-by: Dave Airlie <airlied@redhat.com>

Dave Airlie [Fri, 24 Nov 2017 00:51:35 +0000 (10:51 +1000)]

r600: always flush between gfx and compute

This is in no way optimal, but mixing the two currently seems to
cause problems (lots of hangs). It should be possible; we just need
to figure out more magic.

Signed-off-by: Dave Airlie <airlied@redhat.com>

Dave Airlie [Mon, 18 Dec 2017 04:29:19 +0000 (04:29 +0000)]

r600: fix unused variable warning

Signed-off-by: Dave Airlie <airlied@redhat.com>

Bas Nieuwenhuizen [Sun, 17 Dec 2017 22:53:37 +0000 (23:53 +0100)]

radv: Fix multi-layer blits.

We did not set the layer correctly for the dst, as we would keep
using the base layer. Same for the source image.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102710
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>

Rob Clark [Thu, 23 Nov 2017 16:58:31 +0000 (11:58 -0500)]

freedreno/a5xx: add a5xx blitter

FD_MESA_DEBUG=noblit to disable

Signed-off-by: Rob Clark <robdclark@gmail.com>

Rob Clark [Wed, 22 Nov 2017 17:37:15 +0000 (12:37 -0500)]

freedreno: add generic blitter

Basically a clone of util_blitter_blit() but with special handling to
blit PIPE_BUFFER as a PIPE_TEXTURE_1D.

Signed-off-by: Rob Clark <robdclark@gmail.com>

Rob Clark [Fri, 24 Nov 2017 15:37:22 +0000 (10:37 -0500)]

freedreno: add non-draw batches for compute/blit

Get rid of "gmem" (ie. tiling) ringbuffer, and just emit setup commands
directly to "draw" ringbuffer for compute (and in future for blits not
using the 3d pipe). This way we can have a simple flat cmdstream buffer
and bypass setup related to 3d pipe.

Signed-off-by: Rob Clark <robdclark@gmail.com>

Rob Clark [Tue, 21 Nov 2017 18:20:53 +0000 (13:20 -0500)]

freedreno: track staging and shadow perf ctrs for the HUD

Signed-off-by: Rob Clark <robdclark@gmail.com>

Rob Clark [Mon, 20 Nov 2017 20:34:40 +0000 (15:34 -0500)]

freedreno: staging upload transfers

In the busy && !needs_flush case, we can support a DISCARD_RANGE upload
using a staging buffer. This is a bit different from the case of mid-
batch uploads which require us to shadow the whole resource (because
later draws in an earlier tile happen before earlier draws in a later
tile).

Signed-off-by: Rob Clark <robdclark@gmail.com>

Rob Clark [Sat, 25 Nov 2017 19:10:34 +0000 (14:10 -0500)]

freedreno: update generated headers

Signed-off-by: Rob Clark <robdclark@gmail.com>

Bas Nieuwenhuizen [Sat, 16 Dec 2017 21:02:11 +0000 (22:02 +0100)]

anv: Remove unused variable.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>

Marek Olšák [Tue, 12 Dec 2017 21:21:13 +0000 (22:21 +0100)]

radeonsi: don't call force_dcc_off for buffers

This was undefined yet harmless behavior in LLVM.
Not anymore - it causes a hang now.

Cc: 17.3 <mesa-stable@lists.freedesktop.org>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>

Kenneth Graunke [Fri, 15 Dec 2017 00:17:45 +0000 (16:17 -0800)]

isl: Don't require VALIGN_2 for R32G32B32_FLOAT on Haswell.

According to the RENDER_SURFACE_STATE internal documentation, the
R32G32B32_FLOAT restriction is marked "IVB" only. We choose to apply
it to Ivybridge and Baytrail, but not Haswell.

Apparently fixes KHR-GL46.texture_size_promotion.functional on Haswell.

Changes these tests from crashing to skipping on Haswell:
- KHR-GL46.direct_state_access.textures_storage_multisample_2d_rgb32f
- KHR-GL46.direct_state_access.textures_storage_multisample_3d_rgb32f

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>

Boyuan Zhang [Fri, 15 Dec 2017 16:23:25 +0000 (11:23 -0500)]

radeon/uvd: add and manage render picture list

Create a list in the decoder to store all render picture buffer pointers that
are currently being used in reference picture lists.

During the get message buffer call, check each pointer in render_pic_list[]
against the given pic->ref[] list and remove any pointer that is no longer used
by pic->ref[]. Then add the current render surface pointer to render_pic_list[]
and assign the associated index to result.curr_idx.

As a result, result.curr_idx will have the correct index to represent the
current render picture, instead of the previous incrementing values.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
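
The list maintenance described above could look roughly like the sketch below.
It is not the driver code: update_render_pic_list(), RENDER_PIC_LIST_SIZE and
the use of NULL to mark a free slot are all assumptions made for illustration.
The returned slot index plays the role that the message assigns to
result.curr_idx.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RENDER_PIC_LIST_SIZE 16 /* hypothetical capacity */

/* Prunes entries that are no longer referenced, then stores the current
 * render surface in the first free slot. Returns the slot index, or -1
 * if the list is full. */
static int update_render_pic_list(const void *list[RENDER_PIC_LIST_SIZE],
                                  const void *const *refs, size_t num_refs,
                                  const void *current)
{
    assert(current != NULL);

    /* Drop every stored pointer that the reference list no longer uses. */
    for (size_t i = 0; i < RENDER_PIC_LIST_SIZE; i++) {
        if (list[i] == NULL)
            continue;

        bool still_used = false;
        for (size_t j = 0; j < num_refs; j++) {
            if (refs[j] == list[i]) {
                still_used = true;
                break;
            }
        }
        if (!still_used)
            list[i] = NULL;
    }

    /* Reuse the first free slot for the current render surface. */
    for (size_t i = 0; i < RENDER_PIC_LIST_SIZE; i++) {
        if (list[i] == NULL) {
            list[i] = current;
            return (int)i;
        }
    }
    return -1;
}
```

Because freed slots are reused, the index stays within the list bounds instead
of growing with every decoded picture.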

Boyuan Zhang [Fri, 15 Dec 2017 16:17:32 +0000 (11:17 -0500)]

radeon/vcn: add and manage render picture list

Create a list in the decoder to store all render picture buffer pointers that
are currently being used in reference picture lists.

During the get message buffer call, check each pointer in render_pic_list[]
against the given pic->ref[] list and remove any pointer that is no longer used
by pic->ref[]. Then add the current render surface pointer to render_pic_list[]
and assign the associated index to result.curr_idx.

As a result, result.curr_idx will have the correct index to represent the
current render picture, instead of the previous incrementing values.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>

Boyuan Zhang [Thu, 7 Dec 2017 21:13:51 +0000 (16:13 -0500)]

vl: remove is idr flag

Remove the is_idr flag since it is no longer used.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>

Boyuan Zhang [Fri, 8 Dec 2017 23:22:25 +0000 (18:22 -0500)]

st/va: directly use idr pic flag

Remove is_idr flag, and use idr_pic_flag provided by vaapi directly

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>

Boyuan Zhang [Thu, 7 Dec 2017 21:10:13 +0000 (16:10 -0500)]

radeon/vce: determine idr by pic type

The vaapi encode interface provides idr frame flags, whereas the omx interface
doesn't. Therefore, change to use the picture type to determine idr frames, which
will work for both interfaces.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>

Boyuan Zhang [Thu, 30 Nov 2017 16:58:32 +0000 (11:58 -0500)]

radeon/vcn: determine idr by pic type

The vaapi encode interface provides idr frame flags, whereas the omx interface
doesn't. Therefore, change to use the picture type to determine idr frames, which
will work for both interfaces.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>

Emil Velikov [Thu, 14 Dec 2017 17:20:30 +0000 (17:20 +0000)]

util: scons: wire up the sha1 test

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>

Tim Rowley [Thu, 14 Dec 2017 19:49:56 +0000 (13:49 -0600)]

swr/rast: Move more RTAI handling out of binner

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>

Tim Rowley [Thu, 14 Dec 2017 19:39:29 +0000 (13:39 -0600)]

swr/rast: EXTRACT2 changed from vextract/vinsert to vshuffle

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>

Tim Rowley [Wed, 13 Dec 2017 23:52:52 +0000 (17:52 -0600)]

swr/rast: Fix cache of API thread event manager

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>

Tim Rowley [Tue, 12 Dec 2017 20:23:50 +0000 (14:23 -0600)]

swr/rast: Replace VPSRL with LSHR

Replace use of x86 intrinsic with general llvm IR instruction.

Generates the same final assembly.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>

Tim Rowley [Mon, 11 Dec 2017 23:45:58 +0000 (17:45 -0600)]

swr/rast: Rework thread binding parameters for machine partitioning

Add BASE_NUMA_NODE, BASE_CORE, BASE_THREAD parameters to
SwrCreateContext.

Add optional SWR_API_THREADING_INFO parameter to SwrCreateContext to
control reservation of API threads.

Add SwrBindApiThread() function to allow binding of API threads to
reserved HW threads.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>

Tim Rowley [Mon, 11 Dec 2017 21:51:46 +0000 (15:51 -0600)]

swr/rast: Pull RTAI gather & offset out of clip/bin code

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>

Tim Rowley [Mon, 11 Dec 2017 14:38:46 +0000 (08:38 -0600)]

swr/rast: Remove no-op VBROADCAST of vID

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>

Tim Rowley [Mon, 11 Dec 2017 05:54:30 +0000 (23:54 -0600)]

swr/rast: SIMD16 Fetch - Fully widen 32-bit integer vertex components

Also widen the 16-bit and 8-bit integer vertex component gathers to SIMD16.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>

Tim Rowley [Fri, 8 Dec 2017 23:33:23 +0000 (17:33 -0600)]

swr/rast: Replace INSERT2 vextract/vinsert with JOIN2 vshuffle

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>

Tim Rowley [Fri, 8 Dec 2017 19:59:19 +0000 (13:59 -0600)]

swr/rast: SIMD16 Fetch - Fully widen 16-bit float vertex components

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>

Tim Rowley [Fri, 8 Dec 2017 00:37:07 +0000 (18:37 -0600)]

swr/rast: SIMD16 Fetch - Fully widen 32-bit float vertex components

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>

Tim Rowley [Thu, 7 Dec 2017 23:54:40 +0000 (17:54 -0600)]

swr/rast: Pass prim to ClipSimd

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>

Tim Rowley [Thu, 7 Dec 2017 17:59:45 +0000 (11:59 -0600)]

swr/rast: Pull most of the VPAI manipulation out of the binner/clipper

Move out of binner/clipper; hand them down from the frontend code instead.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>

Tim Rowley [Wed, 6 Dec 2017 18:07:59 +0000 (12:07 -0600)]

swr/rast: Move GatherScissors to header

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>

Tim Rowley [Wed, 6 Dec 2017 16:37:41 +0000 (10:37 -0600)]

swr/rast: Rewrite Shuffle8bpcGatherd using shuffle

Ease future code maintenance, prepare for folding simd8 and simd16 versions.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>

Tim Rowley [Mon, 4 Dec 2017 21:16:13 +0000 (15:16 -0600)]

swr/rast: Convert gather masks to Nx1bit

Simplifies calling code, gets gather function interface closer to llvm's
masked_gather.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>

Tim Rowley [Mon, 4 Dec 2017 00:49:29 +0000 (18:49 -0600)]

swr/rast: WIP - Widen fetch shader to SIMD16

Widen vertex gather/storage to SIMD16 for all component types.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>

Tim Rowley [Wed, 29 Nov 2017 21:14:20 +0000 (15:14 -0600)]

swr/rast: Corrections to multi-scissor handling

binner's GatherScissors() will be turned into a real gather in the not
too distant future.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>

Tim Rowley [Wed, 29 Nov 2017 16:46:49 +0000 (10:46 -0600)]

swr/rast: Binner fixes for viewport index offset handling

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>

Tim Rowley [Tue, 21 Nov 2017 17:05:08 +0000 (11:05 -0600)]

swr/rast: Remove unneeded copy of gather mask

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>

Chris Wilson [Thu, 23 Nov 2017 09:57:08 +0000 (09:57 +0000)]

i965: Allow old begin/end queryobj for gen4/5 with HW contexts

Since we have HW contexts on gen4/5, we could take advantage of them, as
done for gen6+ in commit e32cd5ffbb72 ("i965: Rely on hardware contexts
for query objects on Gen6+."), to only emit a pair of counters at
begin/end queryobj, rather than around every primitive. However, to keep
queryobj working in the meantime as we bring up support for HW ctx on
gen4/5, we can keep using the existing code.

References: e32cd5ffbb72 ("i965: Rely on hardware contexts for query objects on Gen6+.")
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

Rob Clark [Mon, 4 Dec 2017 14:15:27 +0000 (09:15 -0500)]

freedreno: use u_transfer_helper

Signed-off-by: Rob Clark <robdclark@gmail.com>

Rob Clark [Tue, 28 Nov 2017 15:47:06 +0000 (10:47 -0500)]

gallium/util: add u_transfer_helper

Add a new helper that drivers can use to emulate various things that
need special handling in particular in transfer_map:

1) z32_s8x24.. gl/gallium treats this as a single buffer with depth
    and stencil interleaved but hardware frequently treats this as
    separate z32 and s8 buffers.  Special pack/unpack handling is
    needed in transfer_map/unmap to pack/unpack the exposed buffer

2) fake RGTC.. on GPUs designed with GLES in mind but which can otherwise
    do GL3, RGTC can be emulated when it is not natively supported by
    converting to uncompressed internally, but this needs pack/unpack
    in transfer_map/unmap

3) MSAA resolves in the transfer_map() case

v2: add MSAA resolve based on Eric's "gallium: Add helpers for MSAA
    resolves in pipe_transfer_map()/unmap()." patch; avoid wrapping
    pipe_resource, to make it possible for drivers to use both this
    and threaded_context.

Signed-off-by: Rob Clark <robdclark@gmail.com>
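
For case 1 above, the pack/unpack step can be pictured with the following
sketch. It is not the helper's actual code: the struct z32_s8x24 and both
function names are hypothetical, and the layout assumed here is a 32-bit float
depth value followed by a 32-bit word that carries stencil in its low 8 bits
with the remaining 24 bits unused.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed interleaved layout exposed to the API user: one float depth
 * value, then a word with stencil in the low 8 bits (24 bits unused). */
struct z32_s8x24 {
    float z;
    uint32_t s8x24;
};

/* Interleaves separate depth and stencil arrays into one exposed buffer,
 * as would be needed when mapping the resource. */
static void pack_z32_s8x24(struct z32_s8x24 *dst, const float *z,
                           const uint8_t *s, size_t count)
{
    assert(count == 0 || (dst != NULL && z != NULL && s != NULL));
    for (size_t i = 0; i < count; i++) {
        dst[i].z = z[i];
        dst[i].s8x24 = s[i];
    }
}

/* Splits the exposed buffer back into separate depth and stencil arrays,
 * as would be needed when unmapping after a write. */
static void unpack_z32_s8x24(float *z, uint8_t *s,
                             const struct z32_s8x24 *src, size_t count)
{
    assert(count == 0 || (src != NULL && z != NULL && s != NULL));
    for (size_t i = 0; i < count; i++) {
        z[i] = src[i].z;
        s[i] = (uint8_t)(src[i].s8x24 & 0xff);
    }
}
```

A real implementation also has to honor strides, transfer boxes and map flags;
this sketch only shows the per-pixel data movement.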

Tapani Pälli [Thu, 14 Dec 2017 11:53:10 +0000 (13:53 +0200)]

i965: enable EXT_disjoint_timer_query extension

Following dEQP cases pass:
dEQP-EGL.functional.get_proc_address.extension.gl_ext_disjoint_timer_query
dEQP-EGL.functional.client_extensions.disjoint

Piglit test 'ext_disjoint_timer_query-simple' passes with these changes.

No changes/regression observed in Intel CI.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

Tapani Pälli [Tue, 12 Dec 2017 12:46:13 +0000 (14:46 +0200)]

mesa: GL_EXT_disjoint_timer_query extension API bits

Patch adds GL_GPU_DISJOINT_EXT and enables the use of timer queries when
EXT_disjoint_timer_query is enabled.

v2: enable extension only when EXT_disjoint_timer_query set

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v1)
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

Tapani Pälli [Mon, 20 Nov 2017 06:36:52 +0000 (08:36 +0200)]

glapi: add GL_EXT_disjoint_timer_query

Most entrypoints already available via other extensions like
GL_EXT_occlusion_query_boolean, GL_EXT_timer_query.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

Tapani Pälli [Mon, 20 Nov 2017 06:31:40 +0000 (08:31 +0200)]

mesa: add DisjointOperation to gl_shared_state

This state will be used by EXT_disjoint_timer_query. As first
usage, patch sets DisjointOperation true when gpu reset happens.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

Eric Anholt [Thu, 14 Dec 2017 17:41:16 +0000 (09:41 -0800)]

broadcom/vc5: Fix a typo in memcmp for sig unpack checking.

This shockingly ended up working out, because only the first byte of *sig
is used and (sizeof(*sig) != 0) == 1. Fixes a compiler warning.

Link: https://bugs.freedesktop.org/show_bug.cgi?id=104183
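
The kind of typo described here is easy to reproduce. The sketch below is an
assumed reconstruction, not the vc5 source: struct sig_bits and both functions
are hypothetical. It shows that a comparison which lands inside the length
argument turns the length into (sizeof(*a) != 0), which is 1, so only the
first byte is compared. The length is computed in a separate variable so that
compilers do not warn about this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical multi-byte signal description. */
struct sig_bits {
    unsigned char bytes[4];
};

/* Models memcmp(a, b, sizeof(*a) != 0): the misplaced parenthesis makes
 * the length argument 1, so only the first byte is compared. */
static bool differs_typo(const struct sig_bits *a, const struct sig_bits *b)
{
    size_t len = (sizeof(*a) != 0); /* evaluates to 1 */
    assert(len == 1);
    return memcmp(a, b, len);
}

/* Intended form: compare the whole struct, then test the result. */
static bool differs_fixed(const struct sig_bits *a, const struct sig_bits *b)
{
    return memcmp(a, b, sizeof(*a)) != 0;
}
```

Two values that differ only after the first byte look equal to the typo
version, which is why the bug stayed hidden while only the first byte of the
struct was in use.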

Eric Anholt [Wed, 22 Nov 2017 00:33:29 +0000 (16:33 -0800)]

broadcom/vc5: Enable NIR txd lowering on all txd instructions.

Fixes almost all of piglit's arb_shader_texture_lod grad tests, except for
the base -texgrad/texgradcube ones which fail on what appear to be
precision problems.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

Eric Anholt [Wed, 22 Nov 2017 00:21:36 +0000 (16:21 -0800)]

nir: Add a new lowering option to lower all txd to txl.

VC5 requires that all txd are lowered in the shader.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

Eric Anholt [Tue, 21 Nov 2017 21:42:08 +0000 (13:42 -0800)]

nir: Fix interaction of GL_CLAMP lowering with texture offsets.

We want the clamping of the coordinate to apply after the offset, so we
need to do math to lower the offset out of the instruction. Fixes texwrap
offset cases for GL_CLAMP with GL_NEAREST on vc5.

Note: I moved the get_texture_size() verbatim, so that it was defined
before use.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
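
The ordering issue can be illustrated with a scalar sketch. This is not the
NIR lowering itself; lower_offset_then_clamp() is a hypothetical helper that
assumes a normalized coordinate, an integer texel offset and a texture size in
texels along the same axis. The offset is folded into the coordinate first,
and only then is the result clamped to [0, 1].

```c
#include <assert.h>

/* Folds a texel offset into a normalized coordinate, then applies the
 * [0, 1] clamp, so that the clamp takes effect after the offset. */
static float lower_offset_then_clamp(float coord, int offset, int size)
{
    assert(size > 0);

    /* One texel spans 1/size in normalized coordinates. */
    float shifted = coord + (float)offset / (float)size;

    if (shifted < 0.0f)
        return 0.0f;
    if (shifted > 1.0f)
        return 1.0f;
    return shifted;
}
```

Clamping before adding the offset would let the offset push the coordinate
back outside the clamped range, which is the interaction this commit avoids.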

Eric Anholt [Wed, 6 Dec 2017 19:30:02 +0000 (11:30 -0800)]

broadcom/vc5: Fix shader input/outputs for gallium's new NIR linking.

Roland Scheidegger [Wed, 13 Dec 2017 02:33:07 +0000 (03:33 +0100)]

gallivm: implement accurate corner behavior for textureGather with cube maps

The spec says the missing texel (when we wrap around both x and y axis)
should be synthesized as the average of the 3 other texels. For bilinear
filtering however we instead adjusted the filter weights (because, while
the complexity looks similar, there would be 4 times as many color values
to fix up than weights). Obviously this could not work for gather (hence
accurate corner filtering was disabled with gather).
Implement this by just doing it as the spec implies - calculate the 4th
texel as the average of the other 3. With gather of course there's only
one color to worry about, so it's not all that many instructions either
(albeit surely the whole cube map filtering is hilariously complex).

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
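
The corner rule is simple to state in scalar C. The sketch below is only an
illustration, not the gallivm IR-building code: gather_fixup_corner() is a
hypothetical helper that takes the four gathered values of one channel plus
the index of the texel that fell off both edges, and replaces that value with
the average of the other three.

```c
#include <assert.h>

/* Replaces the missing corner texel of a 4-texel gather footprint with
 * the average of the three texels that do exist. */
static void gather_fixup_corner(float texels[4], int missing)
{
    assert(missing >= 0 && missing < 4);

    float sum = 0.0f;
    for (int i = 0; i < 4; i++) {
        if (i != missing)
            sum += texels[i];
    }
    texels[missing] = sum / 3.0f;
}
```

Since gather returns a single channel per texel, this fixup touches one value
per corner rather than a full color.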

Roland Scheidegger [Wed, 13 Dec 2017 02:33:21 +0000 (03:33 +0100)]

gallivm: fix an issue with NaNs with seamless cube filtering

Cube texture wrapping is a bit special since the values (post face
projection) always are within [0,1], so we took advantage of that and
omitted some clamps.
However, we can still get NaNs (either because the coords already had NaNs,
or the face projection generated them), and in fact we didn't handle them
quite safely. I've seen -INT_MAX + 1 being propagated through as the final int
coord value, albeit I didn't observe a crash. (Not quite a coincidence, since
any stride mul with -INT_MAX or -INT_MAX+1 will turn up as a small positive
number - nevertheless, I'd rather not try my luck, I'm not entirely sure it
can't really turn up negative either due to seamless coord swapping, plus
ifloor of a NaN is not guaranteed to return -INT_MAX by any standard. And
we kill off NaNs similarly with ordinary texture wrapping too.)
So kill off the NaNs by using the common max against zero method.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
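
A scalar C analogue of the "max against zero" trick is sketched below. It is
an assumption about the general technique, not the gallivm code:
kill_nan_max_zero() is a hypothetical helper that relies on the fact that any
ordered comparison involving a NaN is false, so a NaN coordinate falls through
to the zero operand.

```c
#include <assert.h>
#include <math.h>

/* Returns max(coord, 0) in a form where a NaN input yields 0.0f: the
 * comparison is false for NaN, so the second operand is selected. */
static float kill_nan_max_zero(float coord)
{
    float result = coord > 0.0f ? coord : 0.0f;

    /* A NaN never compares equal to itself, so this checks the result
     * is a real number before it is converted to an integer coord. */
    assert(result == result);
    return result;
}
```

Passing NAN from <math.h> returns 0.0f, while ordinary coordinates in [0, 1]
pass through unchanged.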

Jason Ekstrand [Wed, 13 Dec 2017 19:51:01 +0000 (11:51 -0800)]

intel/tools: Convert aubinator over to the common framework

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

Jason Ekstrand [Wed, 13 Dec 2017 18:17:40 +0000 (10:17 -0800)]

intel/batch-decoder: Decode registers

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

Jason Ekstrand [Wed, 13 Dec 2017 18:16:46 +0000 (10:16 -0800)]

intel/batch-decoder: Decode dynamic state

Unfortunately, in aubinator and aubinator_error_decode we don't always
know how many of a given state we have, so we must guess. One day,
we'll come up with a way to annotate the batch to solve this problem.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

Jason Ekstrand [Wed, 13 Dec 2017 17:58:27 +0000 (09:58 -0800)]

intel/batch-decoder: Decode constants, binding tables, and samplers

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

Jason Ekstrand [Wed, 13 Dec 2017 19:03:32 +0000 (11:03 -0800)]

intel/tools: Switch aubinator_error_decode over to the gen_print_batch

The shared framework can now do everything that aubinator_error_decode
ever did and more. It's time to make the switch.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

Jason Ekstrand [Wed, 13 Dec 2017 17:46:39 +0000 (09:46 -0800)]

intel/batch-decoder: Decode graphics shaders

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

Jason Ekstrand [Wed, 13 Dec 2017 17:19:57 +0000 (09:19 -0800)]

intel/batch-decoder: Decode vertex and index buffers

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

Jason Ekstrand [Wed, 13 Dec 2017 16:01:03 +0000 (08:01 -0800)]

intel/batch-decoder: Decode MEDIA_INTERFACE_DESCRIPTOR_LOAD

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

Jason Ekstrand [Wed, 13 Dec 2017 08:10:12 +0000 (00:10 -0800)]

intel/tools: Add the start of a generic batch decoder

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

Jason Ekstrand [Wed, 13 Dec 2017 16:23:50 +0000 (08:23 -0800)]

intel/decoder: Expose the raw field value in the iterator

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>