Daniel Stone [Fri, 9 Feb 2018 23:43:25 +0000 (15:43 -0800)]
vulkan/wsi: Add multiple planes to wsi_image
Not currently used.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Timothy Arceri [Wed, 21 Feb 2018 03:36:09 +0000 (14:36 +1100)]
nir: remove old assert
This was originally intended to make sure the remap location
was not -1. However the code has changed alot since then,
the location is now never set to -1 and we also handle
components meaning this old assert has been doing comparisions
with the pointer to the array of component data.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105183
Timothy Arceri [Wed, 21 Feb 2018 02:27:17 +0000 (13:27 +1100)]
radeonsi/nir: collect more accurate output_usagemask
Fixes assert in the glsl-1.50-gs-max-output-components piglit test.
Note that the double handling will only work for doubles that
don't take up multiple slots i.e. double and dvec2. However
dual slot double handling is an existing bug which is made no
worse by this patch.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Timothy Arceri [Wed, 21 Feb 2018 01:30:30 +0000 (12:30 +1100)]
radeonsi/nir: disable GLSL IR loop unrolling
Delaying unrolling and allowing NIR to do it instead has been shown
to result in better code in drivers such as i965. shader-db results
appear to show the same is true for radeonsi.
The other advantage is that using NIR unrolling improves compile
times significantly.
Totals from affected shaders:
SGPRS: 9624 -> 10016 (4.07 %)
VGPRS: 6800 -> 6464 (-4.94 %)
Spilled SGPRs: 0 -> 2 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 359176 -> 332264 (-7.49 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 1355 -> 1432 (5.68 %)
Wait states: 0 -> 0 (0.00 %)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Timothy Arceri [Tue, 20 Feb 2018 23:10:33 +0000 (10:10 +1100)]
radeonsi/nir: fix tess varying loads for doubles
Fixes the following piglit tests:
tests/spec/arb_tessellation_shader/execution/double-array-vs-tcs-tes.shader_test
tests/spec/arb_tessellation_shader/execution/double-vs-tcs-tes.shader_test
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Timothy Arceri [Tue, 20 Feb 2018 23:09:18 +0000 (10:09 +1100)]
ac/radeonsi: pass type to load_tess_varyings()
We need this to be able to load 64bit varyings.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Daniel Stone [Wed, 21 Feb 2018 10:39:34 +0000 (10:39 +0000)]
x11/dri3: Store raw present completion mode
The DRI3 drawable info struct currently stores a boolean for whether the
last completed operation was a flip or not. As we need to track the full
completion mode for handling suboptimal returns, change the 'flipping'
field to the raw present completion mode from the server.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Daniel Stone [Wed, 21 Feb 2018 11:39:09 +0000 (11:39 +0000)]
x11/dri3: Don't open-code ARRAY_SIZE
Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Jason Ekstrand [Wed, 21 Feb 2018 21:07:10 +0000 (13:07 -0800)]
anv: Don't assert that stencil HiZ clears are single-slice
It's true for depth HiZ clears because we only have HiZ on single-slice
images right now. However, for stencil-only clears there is no such
restriction.
Tested-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Jason Ekstrand [Sun, 11 Feb 2018 06:10:03 +0000 (22:10 -0800)]
anv: Only copy clear dwords if we're rendering to the first slice
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Marek Olšák [Tue, 30 Jan 2018 01:51:47 +0000 (02:51 +0100)]
radeonsi: don't flush when si_eliminate_fast_color_clear is no-op
Marek Olšák [Sun, 7 Jan 2018 20:04:55 +0000 (21:04 +0100)]
radeonsi: make texture_discard_cmask/eliminate functions non-static
James Zhu [Mon, 5 Feb 2018 17:02:50 +0000 (12:02 -0500)]
radeonsi: enable uvd encode for HEVC main
Enable UVD encode for HEVC main profile
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
James Zhu [Mon, 5 Feb 2018 22:08:22 +0000 (17:08 -0500)]
radeonsi:create uvd hevc enc entry
Add UVD hevc encode pipe video codec creation entry
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
James Zhu [Tue, 6 Feb 2018 18:29:11 +0000 (13:29 -0500)]
radeon/uvd:add uvd hevc enc functions
Implement UVD hevc encode functions
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
James Zhu [Tue, 6 Feb 2018 18:26:28 +0000 (13:26 -0500)]
radeon/uvd:add uvd hevc enc hw ib implementation
Implement required IBs for UVD HEVC encode.
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
James Zhu [Tue, 6 Feb 2018 18:18:21 +0000 (13:18 -0500)]
radeon/uvd:add uvd hevc enc hw interface header
Add hevc encode hardware interface for UVD
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
James Zhu [Tue, 6 Feb 2018 17:39:03 +0000 (12:39 -0500)]
winsys/amdgpu:add uvd hevc enc support in amdgpu cs
Support UVD HEVC encode in amdgpu cs
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
James Zhu [Mon, 5 Feb 2018 21:28:13 +0000 (16:28 -0500)]
amd/common:add uvd hevc enc support check in hw query
Based on amdgpu hardware query information to check if UVD hevc enc support
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Karol Herbst [Mon, 19 Feb 2018 23:45:14 +0000 (00:45 +0100)]
nvir/nvc0: fix legalizing of ld unlock c0[0x10000]
We have to increase the file index also for 0x10000 not just for values
greater than 0x10000.
Fixes: 37b67db6ae34fb6586d640a7a1b6232f091dd812
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Samuel Pitoiset [Tue, 20 Feb 2018 10:11:43 +0000 (11:11 +0100)]
ac/nir: add glsl_is_array_image() helper
For consistency.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Tue, 20 Feb 2018 10:11:42 +0000 (11:11 +0100)]
ac/nir: set the DA field when performing atomics on 3D images
This doesn't fix anything known but it should definitely be set.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Eric Anholt [Sat, 10 Feb 2018 11:19:00 +0000 (11:19 +0000)]
i965: Fix compiler warning about write being undefined.
This looks like it should be protected by the assume() about
nr_color_regions, but my compiler warns anyway.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Eric Anholt [Sat, 10 Feb 2018 11:03:38 +0000 (11:03 +0000)]
glsl/tests: Fix a compiler warning about signed/unsigned loop comparison.
Fixes: d32956935edf ("glsl: Walk a list of ir_dereference_array to mark array elements as accessed")
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Eric Anholt [Sat, 10 Feb 2018 10:45:18 +0000 (10:45 +0000)]
loader: Fix compiler warnings about truncating the PCI ID path.
My build was producing:
../src/loader/loader.c:121:67: warning: ‘%1u’ directive output may be truncated writing between 1 and 3 bytes into a region of size 2 [-Wformat-truncation=]
and we can avoid this careful calculation by just using asprintf (as we do
elsewhere in the file).
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Eric Anholt [Sat, 10 Feb 2018 10:41:07 +0000 (10:41 +0000)]
glsl: Silence warnings in the uniform initializer test about 16-bit types
They should probably get unit tests implemented, but this cleans up a
bunch of warnings in my build for now.
Fixes: 59f458cd8703 ("glsl: Add 16-bit types")
Cc: Eduardo Lima Mitev <elima@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Jordan Justen [Wed, 8 Nov 2017 23:42:14 +0000 (15:42 -0800)]
i965: Enable disk shader cache by default
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Dave Airlie [Mon, 19 Feb 2018 04:59:53 +0000 (04:59 +0000)]
radv: don't send num_tcs_input_cp to sgprs.
We never use it in the shaders.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Mon, 19 Feb 2018 04:55:52 +0000 (04:55 +0000)]
radv/tess: don't need to look in constant for vertices_per_patch
This just avoids passing this value via user sgprs.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Mon, 19 Feb 2018 06:19:07 +0000 (06:19 +0000)]
ac/radv: cleanup some tcs output values access
Just consolidates some code to make it easier to change.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Mon, 19 Feb 2018 06:53:21 +0000 (06:53 +0000)]
ac/radv: remove total_vertices variable
This just removes an unneeded variable.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Mon, 19 Feb 2018 20:33:17 +0000 (20:33 +0000)]
ac/radv: don't mark tess inner as used if we don't use it.
This just avoids marking it as a used output if we don't
actually use it.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Tue, 20 Feb 2018 00:15:18 +0000 (10:15 +1000)]
ac/nir: to integer the args to bcsel.
dEQP-VK.tessellation.invariance.outer_edge_symmetry.triangles_equal_spacing_ccw
was hitting an llvm assert due to one value being an int and the
other a float.
This just casts both values to integer and fixes the test.
Fixes: dEQP-VK.tessellation.invariance.outer_edge_symmetry.triangles_equal_spacing_ccw
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Jason Ekstrand [Fri, 2 Feb 2018 22:51:56 +0000 (14:51 -0800)]
anv/blorp: Use layout_to_aux_usage when a layout is provided
Instead of having aux usage and ANV_AUX_USAGE_DEFAULT to mean "give me
something reasonable" we now use anv_layout_to_aux_usage whenever a
layout is available. If a layout is available, we ignore the aux_usage
parameter. For the cases where we have an explicit aux usage such as
clears and aux ops, we have a new ANV_IMAGE_LAYOUT_EXPLICIT_AUX layout.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jason Ekstrand [Fri, 2 Feb 2018 04:02:48 +0000 (20:02 -0800)]
anv/cmd_buffer: Delete some assert-only variables
Checking the sample count is almost as good as aux usage in this case.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jason Ekstrand [Fri, 2 Feb 2018 03:36:22 +0000 (19:36 -0800)]
anv/cmd_buffer: Use layout_to_* helpers in compute_aux_usage
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jason Ekstrand [Fri, 2 Feb 2018 03:13:12 +0000 (19:13 -0800)]
anv/cmd_buffer: Simplify transition_depth_buffer
If we don't have HiZ, then anv_layout_to_aux_usage will return NONE for
both layouts. If the two layouts are the same, they will get the aux
usage. In either case, the code below will give us ISL_AUX_OP_NONE and
we'll return without doing anything.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jason Ekstrand [Tue, 21 Nov 2017 23:56:35 +0000 (15:56 -0800)]
anv/cmd_buffer: Do subpass image transitions in begin/end_subpass
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Jason Ekstrand [Sat, 13 Jan 2018 18:59:05 +0000 (10:59 -0800)]
anv/cmd_buffer: Mark depth/stencil surfaces written in begin_subpass
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Jason Ekstrand [Tue, 21 Nov 2017 23:16:40 +0000 (15:16 -0800)]
anv/cmd_buffer: Sync clear values in begin_subpass
This is quite a bit cleaner because we now sync the clear values at the
same time as we do the fast clear. For loading the clear values into
the surface state, we now do it once when we handle the LOAD_OP_LOAD
instead of every subpass.
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Jason Ekstrand [Sat, 13 Jan 2018 18:45:55 +0000 (10:45 -0800)]
anv/pass: Store usage in each subpass attachment
This requires us to ditch the VkAttachmentReference struct in favor of
an anv-specific struct. However, we can now easily identify from just
the subpass attachment what kind of an attachment it is. This will make
iteration over anv_subpass::attachments a little easier in some case.
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Jason Ekstrand [Wed, 22 Nov 2017 04:29:36 +0000 (20:29 -0800)]
anv/cmd_buffer: Add a concept of pending load aspects
These are the same as pending clear aspects only for the "load"
operation.
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Jason Ekstrand [Mon, 27 Nov 2017 18:43:03 +0000 (10:43 -0800)]
anv/cmd_buffer: Iterate all subpass attachments when clearing
This unifies things a bit because we now handle depth and stencil at the
same time.
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Jason Ekstrand [Mon, 27 Nov 2017 18:20:00 +0000 (10:20 -0800)]
anv/cmd_buffer: Decide whether or not to HiZ clear up-front
This moves the decision out of begin_subpass and into BeginRenderPass
like the decision for color clears. We use a similar name for the
function for depth/stencil as for color even though no aux usage is
really getting computed.
v2 (Jason Ekstrand):
- Don't always disable HiZ clears by accident
- Use the initial layout to decide whether to do fast clears
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Jason Ekstrand [Tue, 21 Nov 2017 22:46:25 +0000 (14:46 -0800)]
anv/cmd_buffer: Move the rest of clear_subpass into begin_subpass
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Jason Ekstrand [Tue, 21 Nov 2017 22:00:44 +0000 (14:00 -0800)]
intel/blorp: Add a blorp_hiz_clear_depth_stencil helper
This is similar to blorp_gen8_hiz_clear_attachments except that it takes
actual images instead of trusting in the already set depth state.
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Jason Ekstrand [Tue, 21 Nov 2017 21:30:49 +0000 (13:30 -0800)]
anv/cmd_buffer: Move the color portion of clear_subpass into begin_subpass
This doesn't really change much now but it will give us more/better
control over clears in the future. The one interesting functional
change here is that we are now re-emitting 3DSTATE_DEPTH_BUFFERS and
friends for each clear. However, this only happens at begin_subpass
time so it shouldn't be substantially more expensive.
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Jason Ekstrand [Tue, 21 Nov 2017 20:42:45 +0000 (12:42 -0800)]
anv/cmd_buffer: Pass a subpass id into begin_subpass
This is a bit less awkward than passing in the subpass because it means
we don't have to extract the subpass id from the subpass.
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Jason Ekstrand [Tue, 21 Nov 2017 20:41:01 +0000 (12:41 -0800)]
anv/cmd_buffer: Add begin/end_subpass helpers
Having begin/end_subpass is a bit nicer than the begin/next/end hooks
that Vulkan gives us.
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Jason Ekstrand [Tue, 21 Nov 2017 20:27:43 +0000 (12:27 -0800)]
anv/cmd_buffer: Apply subpass flushes before set_subpass
This seems slightly more correct because it means that the flushes
happen before any clears or resolves implied by the subpass transition.
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Jason Ekstrand [Fri, 9 Feb 2018 00:44:56 +0000 (16:44 -0800)]
anv: Use framebuffer layers for implicit subpass transitions
Fixes: de3be618016 "anv/cmd_buffer: Rework aux tracking"
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Jason Ekstrand [Tue, 13 Feb 2018 00:03:28 +0000 (16:03 -0800)]
anv: Be more careful about fast-clear colors
Previously, we just used all the channels regardless of the format.
This is less than ideal because some channels may have undefined values
and this should be ok from the client's perspective. Even though the
driver should do the correct thing regardless of what is in the
undefined value, it makes things less deterministic. In particular, the
driver may choose to fast-clear or not based on undefined values. This
level of nondeterminism is bad.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Jason Ekstrand [Mon, 12 Feb 2018 23:50:12 +0000 (15:50 -0800)]
intel/isl: Add an isl_color_value_is_zero helper
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Jason Ekstrand [Sat, 17 Feb 2018 01:35:15 +0000 (17:35 -0800)]
anv/gpu_memcpy: CS Stall before a MI memcpy on gen7
This fixes a pile of hangs caused by the recent shuffling of resolves
and transitions. The particularly problematic case is when you have at
least three attachments with load ops of CLEAR, LOAD, CLEAR. In this
case, we execute the first CLEAR followed by a MI memcpy to copy the
clear values over for the LOAD followed by a second CLEAR. The MI
commands cause the first CLEAR to hang which causes us to get stuck on
the 3DSTATE_MULTISAMPLE in the second CLEAR.
We also add guards for BLORP to fix the same issue. These shouldn't
actually do anything right now because the only use of indirect clears
in BLORP today is for resolves which are already guarded by a render
cache flush and CS stall. However, this will guard us against potential
issues in the future.
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Nanley Chery <nanley.g.chery@intel.com>
Guillaume Charifi [Tue, 20 Feb 2018 11:49:28 +0000 (12:49 +0100)]
st/mesa: Factorize duplicate code for atomic buffer binding
Signed-off-by: Guillaume Charifi <guillaume.charifi@sfr.fr>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Guillaume Charifi [Fri, 5 Jan 2018 16:49:39 +0000 (17:49 +0100)]
st/mesa: Factorize duplicate code in st_update_framebuffer_state()
Signed-off-by: Guillaume Charifi <guillaume.charifi@sfr.fr>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Rob Clark [Tue, 20 Feb 2018 18:40:46 +0000 (13:40 -0500)]
freedreno/ir3: fix use_count refcnt'ing issue
Was hitting an assert with vs-varying-array-mat4-index-col-row-wr.shader_test
When eliminating a copy, we were dropping the use_count of the mov that
is skipped, but not increasing the use_count of it's src instruction.
Fixes: 76440fcca91 freedreno/ir3: clean up dangling false-dep's
Signed-off-by: Rob Clark <robdclark@gmail.com>
Eric Engestrom [Tue, 20 Feb 2018 13:35:56 +0000 (13:35 +0000)]
docs: fix patent url
Reported-by: Pierre Moreau <pierre.morrow@free.fr>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Brian Paul [Fri, 16 Feb 2018 20:57:51 +0000 (13:57 -0700)]
svga: replaced 'unsigned' with proper enum types in shader code
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Jonathan Gray [Tue, 20 Feb 2018 06:38:00 +0000 (17:38 +1100)]
configure.ac: pthread-stubs not present on OpenBSD
pthread-stubs is no longer required on OpenBSD and has been removed.
libpthread parts involved moved to libc.
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Cc: 17.3 18.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Andres Gomez [Tue, 13 Feb 2018 22:42:57 +0000 (00:42 +0200)]
swr: bump minimum supported LLVM version to 4.0
Since radv and radeonsi removed support for LLVM 3.9 the distcheck
target got broken because SWR distribution needed 3.9.x.
After checking with George Kyriazis, SWR is OK with moving to LLVM 4.0
and above, which will solve this problem.
Fixes: 3bf1e036e8a ("amd: remove support for LLVM 3.9")
Cc: George Kyriazis <george.kyriazis@intel.com>
Cc: Tim Rowley <timothy.o.rowley@intel.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Dylan Baker <dylan@pnwbakers.com>
Cc: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: George Kyriazis <george.kyriazis@intel.com>
Andres Gomez [Tue, 6 Feb 2018 15:42:42 +0000 (17:42 +0200)]
travis: radeonsi and radv need LLVM 4.0
Fixes: 3bf1e036e8a ("amd: remove support for LLVM 3.9")
Cc: Marek Olšák <marek.olsak@amd.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Jan Vesely <jan.vesely@rutgers.edu>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Samuel Pitoiset [Fri, 16 Feb 2018 09:33:10 +0000 (10:33 +0100)]
ac/nir: move ac_declare_lds_as_pointer() outside of the switch
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Fri, 16 Feb 2018 10:00:14 +0000 (11:00 +0100)]
radv: allow to force family using RADV_FORCE_FAMILY
Useful for pipeline-db.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Thomas Hellstrom [Fri, 9 Feb 2018 08:37:19 +0000 (09:37 +0100)]
loader_dri3/glx/egl: Reinstate the loader_dri3_vtable get_dri_screen callback
Removing this callback caused rendering corruption in some multi-screen cases,
so it is reinstated but without the drawable argument which was never used
by implementations and was confusing since the drawable could have been
created with another screen.
Cc: "17.3 18.0" mesa-stable@lists.freedesktop.org
Fixes: 5198e48a0d (loader_dri3/glx/egl: Remove the loader_dri3_vtable get_dri_screen callback)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105013
Reported-by: Daniel van Vugt <daniel.van.vugt@canonical.com>
Tested-by: Timo Aaltonen <tjaalton@ubuntu.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Thomas Hellstrom [Mon, 15 Jan 2018 11:51:27 +0000 (12:51 +0100)]
svga: Fix a leftover debug hack
Fix what appears to be a leftover debug hack.
The hack would force the driver to take a different blit path; possibly,
although unverified, reverting to software blits.
Tested using piglit tests/quick. No related regressions.
Cc: "17.2 17.3 18.0" <mesa-stable@lists.freedesktop.org>
Fixes: 9d81ab7376 (svga: Relax the format checks for copy_region_vgpu10 somewhat)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104625
Reported-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Iago Toral Quiroga [Wed, 7 Feb 2018 08:21:47 +0000 (09:21 +0100)]
anv/entrypoints: make vkGetDeviceProcAddr return NULL for instance commands
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Ilia Mirkin [Sun, 31 Dec 2017 07:39:11 +0000 (02:39 -0500)]
nv50,nvc0: mark ABGR format as displayable instead of ARGB format
This matches the hardware's capabilities.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ilia Mirkin [Sun, 31 Dec 2017 07:36:39 +0000 (02:36 -0500)]
st/dri: only expose config formats that are display targets
In the case of NVIDIA hardware, ABGR is displayable but ARGB is not.
Only advertise the one set in the visuals list.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Ilia Mirkin [Sun, 31 Dec 2017 06:05:06 +0000 (01:05 -0500)]
mesa: add xbgr support adjacent to xrgb
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Daniel Stone <daniels@collabora.com>
Timothy Arceri [Fri, 16 Feb 2018 00:41:17 +0000 (11:41 +1100)]
st/shader_cache: copy nir pointer to gl_program after deserializing
This fixes a crash when running the arb_get_program_binary-api-errors
piglit test twice.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Timothy Arceri [Thu, 15 Feb 2018 23:14:05 +0000 (10:14 +1100)]
radeonsi: add nir shader cache support
In future we might want to try avoid calling nir_serialize() but
this works for now.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Timothy Arceri [Thu, 15 Feb 2018 05:58:07 +0000 (16:58 +1100)]
radeonsi: rename variables tgsi_binary -> ir_binary
This better represents that the ir could be either tgsi or nir.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Emil Velikov [Mon, 19 Feb 2018 22:10:18 +0000 (22:10 +0000)]
docs: update calendar, add news and link release notes to 17.3.5
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Emil Velikov [Mon, 19 Feb 2018 22:07:23 +0000 (22:07 +0000)]
docs: add sha256 checksums for 17.3.5
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit
164a993112cc7278d46b7ec8f7f617eb683b212c)
Emil Velikov [Mon, 19 Feb 2018 22:01:35 +0000 (22:01 +0000)]
docs: add release notes for 17.3.5
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit
2529d77179065b983d69c620c7f71281aefe4f98)
Marek Olšák [Mon, 19 Feb 2018 16:55:34 +0000 (17:55 +0100)]
radeonsi: fix regression from 32-bit pointers on CI
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Samuel Pitoiset [Fri, 16 Feb 2018 09:28:37 +0000 (10:28 +0100)]
radv: compact varyings after removing unused ones
It makes no sense to compact before, and the description of
nir_compact_varyings() confirms that.
Polaris10:
Totals from affected shaders:
SGPRS: 108528 -> 108128 (-0.37 %)
VGPRS: 74548 -> 74500 (-0.06 %)
Spilled SGPRs: 844 -> 814 (-3.55 %)
Code Size:
3007328 ->
2992932 (-0.48 %) bytes
Max Waves: 16019 -> 16009 (-0.06 %)
Vega10:
Totals from affected shaders:
SGPRS: 106088 -> 106232 (0.14 %)
VGPRS: 74652 -> 74700 (0.06 %)
Spilled SGPRs: 692 -> 658 (-4.91 %)
Code Size:
2967708 ->
2953028 (-0.49 %) bytes
Max Waves: 18178 -> 18162 (-0.09 %)
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Timothy Arceri [Fri, 16 Feb 2018 05:14:29 +0000 (16:14 +1100)]
radeonsi/nir: fix gl_FragCoord for pixel_center_integer
Fixes piglit test glsl-arb-fragment-coord-conventions
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Timothy Arceri [Fri, 16 Feb 2018 05:10:58 +0000 (16:10 +1100)]
glsl/nir: add pixel_center_integer to shader info
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Ilia Mirkin [Sat, 10 Feb 2018 18:39:56 +0000 (13:39 -0500)]
gm107/ir: avoid using kepler instruction capabilities
Split up the op properties table into generation-specific bits, and only
use the kepler ones on kepler. Fixes some CTS images tests.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Ilia Mirkin [Sat, 13 Jan 2018 17:32:41 +0000 (12:32 -0500)]
nvc0: add support for bindless on maxwell+
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ilia Mirkin [Sat, 13 Jan 2018 17:28:16 +0000 (12:28 -0500)]
gm107/ir: change how SUQ works in preparation for bindless
All this information can be retrieved from the TIC directly. Avoid
having to dip into the constbuf information about the image.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Kenneth Graunke [Fri, 15 Aug 2014 05:36:45 +0000 (22:36 -0700)]
i965: Use absolute addressing for constant buffer 0 on Kernel 4.16+.
By default, 3DSTATE_CONSTANT_* Constant Buffer 0 is relative to dynamic
state base address. This makes it unusable for pushing UBOs.
There is a bit in the INSTPM register (or CS_DEBUG_MODE2 on Skylake)
which controls whether buffer 0 is relative to dynamic state base
address, or simply a normal pointer. Setting that gives us full
flexibility. This lets us push up to 4 UBO ranges.
We can't currently write this on Haswell and earlier, and will need
to update the kernel command parser, and then do the whole version
checking song and dance. We also need a brand new kernel that supports
context isolation - on older kernels, newly created contexts inherit
register state from whatever happened to be running. So, setting this
would have catastrophic impact on other drivers such as libva, Beignet,
or older Mesa.
See commit
8ec5a4e4a4a32f4de351c5fc2bf0eb615b6eef1b where we did this
once before, but had to revert it in commit
013d33122028f2492da90a03a.
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Kenneth Graunke [Fri, 16 Feb 2018 19:03:58 +0000 (11:03 -0800)]
i965: Stop restoring the default L3 configuration on Kernel 4.16+.
Kernel 4.16 has proper context isolation, which means we can change
the L3 configuration without worrying about that leaking to other
newly created contexts, breaking the assumptions of other userspace.
So, disable our workaround to reprogram it back to the default.
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Mikko Perttunen [Thu, 15 Feb 2018 18:13:20 +0000 (20:13 +0200)]
nvc0: Use GP100_COMPUTE_CLASS on GP10B
GP10B requires the use of GP100_COMPUTE_CLASS instead of
GP104_COMPUTE_CLASS as is used for other non-GP100 chips.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Daniel Stone [Thu, 15 Feb 2018 15:04:51 +0000 (15:04 +0000)]
i965: Fix aux-surface size check
The previous commit reworked the checks intel_from_planar() to check the
right individual cases for regular/planar/aux buffers, and do size
checks in all cases.
Unfortunately, the aux size check was broken, and required the aux
surface to be allocated with the correct aux stride, but full image
height (!).
As the ISL aux surface is not recorded in the DRIimage, we cannot easily
access it to check. Instead, store the aux size from when we do have the
ISL surface to hand, and check against that later when we go to access
the aux surface.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Fixes: c2c4e5bae3ba ("i965: Fix bugs in intel_from_planar")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Marek Olšák [Mon, 1 Jan 2018 20:04:22 +0000 (21:04 +0100)]
radeonsi: implement 32-bit pointers in user data SGPRs (v2)
User SGPRs changes:
VS: 14 -> 9
TCS: 14 -> 10
TES: 10 -> 6
GS: 8 -> 4
GSCOPY: 2 -> 1
PS: 9 -> 5
Merged VS-TCS: 24 -> 16
Merged VS-GS: 18 -> 11
Merged TES-GS: 18 -> 11
SGPRS:
2170102 ->
2158430 (-0.54 %)
VGPRS:
1645656 ->
1641516 (-0.25 %)
Spilled SGPRs: 9078 -> 8810 (-2.95 %)
Spilled VGPRs: 130 -> 114 (-12.31 %)
Scratch size: 1508 -> 1492 (-1.06 %) dwords per thread
Code Size:
52094872 ->
52692540 (1.15 %) bytes
Max Waves: 371848 -> 372723 (0.24 %)
v2: - the shader cache needs to take address32_hi into account
- set amdgpu-32bit-address-high-bits
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (v1)
Marek Olšák [Sun, 31 Dec 2017 21:58:57 +0000 (22:58 +0100)]
radeonsi: disallow constant buffers with a 64-bit address in slot 0
State trackers must use a user buffer or const_uploader,
or set pipe_resource::flags same as const_uploader->flags.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Marek Olšák [Sun, 31 Dec 2017 21:51:14 +0000 (22:51 +0100)]
radeonsi: move const_uploader allocations to 32-bit address space
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Marek Olšák [Sun, 31 Dec 2017 21:34:45 +0000 (22:34 +0100)]
winsys/radeon: implement and enable 32-bit VM allocations
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Marek Olšák [Sun, 31 Dec 2017 21:02:35 +0000 (22:02 +0100)]
winsys/radeon: add struct radeon_vm_heap
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Marek Olšák [Sun, 31 Dec 2017 20:36:37 +0000 (21:36 +0100)]
winsys/amdgpu: enable 32-bit VM allocations
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Marek Olšák [Sun, 31 Dec 2017 20:32:36 +0000 (21:32 +0100)]
gallium/radeon: add 32-bit address space heaps
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Marek Olšák [Fri, 2 Feb 2018 17:22:15 +0000 (18:22 +0100)]
ac: query high bits of 32-bit address space
Marek Olšák [Sat, 27 Jan 2018 01:05:23 +0000 (02:05 +0100)]
gallium: use PIPE_CAP_CONSTBUF0_FLAGS
Marek Olšák [Sat, 27 Jan 2018 00:52:08 +0000 (01:52 +0100)]
gallium: allow drivers to impose BO flags restrictions on constant buffer 0
Required by radeonsi for optimal behavior.
Alexander von Gluck IV [Fri, 16 Feb 2018 22:56:31 +0000 (16:56 -0600)]
meson: Add Haiku platform support v4
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Anuj Phogat [Thu, 15 Feb 2018 23:35:42 +0000 (15:35 -0800)]
anv/icl: Add render target flush after uploading binding table
The PIPE_CONTROL command description says:
"Whenever a Binding Table Index (BTI) used by a Render Taget Message
points to a different RENDER_SURFACE_STATE, SW must issue a Render
Target Cache Flush by enabling this bit. When render target flush
is set due to new association of BTI, PS Scoreboard Stall bit must
be set in this packet."
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Anuj Phogat [Tue, 13 Feb 2018 21:48:34 +0000 (13:48 -0800)]
anv/icl: Enable float blend optimization
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>