Brian Paul [Mon, 25 Feb 2019 21:51:37 +0000 (14:51 -0700)]
mesa: fix display list corner case assertion
This fixes a failed assertion in glDeleteLists() for the following
case:
list = glGenLists(1);
glDeleteLists(list, 1);
when those are the first display list commands issued by the
application.
When we generate display lists, we plug in empty lists created with
the make_list() helper. This function uses the OPCODE_END_OF_LIST
opcode but does not call dlist_alloc() which would set the
InstSize[OPCODE_END_OF_LIST] element to non-zero.
When the empty list was deleted, we failed the InstSize[opcode] > 0
assertion.
Typically, display lists are created with glNewList/glEndList so we
set InstSize[OPCODE_END_OF_LIST] = 1 in dlist_alloc(). That's why
this bug wasn't found before.
To fix this failure, simply initialize the InstSize[OPCODE_END_OF_LIST]
element in make_list().
The game oolite was hitting this.
Fixes: https://github.com/OoliteProject/oolite/issues/325
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Brian Paul [Fri, 1 Feb 2019 03:01:30 +0000 (20:01 -0700)]
svga: fix dma.pending > 0 test
The dma.pending field is boolean, so testing for > 0 isn't right.
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Brian Paul [Fri, 1 Feb 2019 02:58:30 +0000 (19:58 -0700)]
svga: assorted whitespace and formatting fixes
Remove trailing whitespace, etc.
Trivial.
Brian Paul [Mon, 25 Feb 2019 21:27:38 +0000 (14:27 -0700)]
st/mesa: whitespace/formatting fixes in st_cb_texture.c
Remove trailing whitespace, replace tabs w/ spaces, etc.
Trivial.
Eleni Maria Stea [Fri, 22 Feb 2019 21:02:30 +0000 (23:02 +0200)]
i965: fixed clamping in set_scissor_bits when the y is flipped
Calculating the scissor rectangle fields with the y flipped (0 on top)
can generate negative values that will cause assertion failure later on
as the scissor fields are all unsigned. We must clamp the bbox values
again to make sure they don't exceed the fb_height. Also fixed a
calculation error.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108999
https://bugs.freedesktop.org/show_bug.cgi?id=109594
v2:
- I initially clamped the values inside the if (Y is flipped) case
and I made a mistake in the calculation: the clamp of the bbox[2] should
be a check if (bbox[2] >= fbheight) bbox[2] = fbheight - 1 instead and I
shouldn't have changed the ScissorRectangleYMax calculation. As the
fixed code is equivalent with using CLAMP instead of MAX2 at the top of
the function when bbox[2] and bbox[3] are calculated, and the 2nd is more
clear, I replaced it. (Nanley Chery)
v3:
- Reversed the CLAMP change in bbox[3] as the API guarantees that the
viewport height is positive. (Nanley Chery)
v4:
- Added nomination for the mesa-stable branch and the link to the second
bugzilla bug (Nanley Chery)
CC: <mesa-stable@lists.freedesktop.org>
Tested-by: Paul Chelombitko <qamonstergl@gmail.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Eduardo Lima Mitev [Tue, 26 Feb 2019 07:48:46 +0000 (08:48 +0100)]
freedreno/a6xx: Silence compiler warnings
util_format_compose_swizzles() expects 'const unsigned char' and we
are feeding it 'char'.
Reviewed-by: Rob Clark <robdclark@gmail.com>
Kasireddy, Vivek [Wed, 13 Feb 2019 01:03:52 +0000 (17:03 -0800)]
i965: Add support for sampling from XYUV images
Add support to the i965 DRI driver to sample from XYUV8888 buffers.
Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Kasireddy, Vivek [Wed, 13 Feb 2019 00:44:04 +0000 (16:44 -0800)]
dri: Add XYUV8888 format
In addition to adding this format to the dri_interface header,
add an entry in the android and wayland backends as well.
Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Vivek Kasireddy [Thu, 21 Feb 2019 02:29:21 +0000 (18:29 -0800)]
drm-uapi: Update headers from drm-next
Pull new updates from drm-next as of the following commit:
commit
a5f2fafece141ef3509e686cea576366d55cabb6
Merge:
71f4e45a4ed3 860433ed2a55
Author: Dave Airlie <airlied@redhat.com>
Date: Wed Feb 20 12:16:30 2019 +1000
Merge https://gitlab.freedesktop.org/drm/msm into drm-next
Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Kasireddy, Vivek [Wed, 13 Feb 2019 00:02:20 +0000 (16:02 -0800)]
nir/lower_tex: Add support for XYUV lowering
The memory layout associated with this format would be:
Byte: 0 1 2 3
Component: V U Y X
Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Lionel Landwerlin [Mon, 25 Feb 2019 10:58:40 +0000 (10:58 +0000)]
imgui: update memory editor
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
Lionel Landwerlin [Mon, 25 Feb 2019 10:54:50 +0000 (10:54 +0000)]
imgui: update commit
In commit
3950e7c11efc86 ("imgui: bump copy") I forgot to update the
README about what copy of imgui we carry.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
Eric Engestrom [Tue, 22 Jan 2019 16:49:29 +0000 (16:49 +0000)]
driinfo: add DTD to allow the xml to be validated
This DTD can be used to validate the output and make sure any parsers
out there can handle it:
$ xmllint --noout --valid driinfo.xml
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Lionel Landwerlin [Fri, 22 Feb 2019 12:54:53 +0000 (12:54 +0000)]
vulkan/overlay: fix includes
The Loader/Validation-Layers repository allow the user to choose where
header files are installed. On my system I choose /usr/include
thinking it was the obvious "base" location, but it turns out the
headers end up being installed right there rather in a vulkan
subdirectory. On Debian/Ubuntu the selected installation path is
/usr/include/vulkan, so just go with that.
Hopefully other distro don't choose another path.
Note that the validation layer doesn't provide a .pc file so we have
no way of querying where the headers are installed.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109739
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Lionel Landwerlin [Fri, 22 Feb 2019 12:54:13 +0000 (12:54 +0000)]
vulkan/overlay: fix missing installation of layer
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109739
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Eric Engestrom [Wed, 20 Feb 2019 17:49:14 +0000 (17:49 +0000)]
dri_interface: add missing #include
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Eric Engestrom [Fri, 22 Feb 2019 15:52:08 +0000 (15:52 +0000)]
gitlab-ci: always run the containers build
If the first time a fork was created, the job creating the containers was
manually cancelled, this would have left the fork unable to use the CI
(until the next automatic regeneration of the container).
Avoid this by always running the container-generation job, even though
99% of the time it will spin up, see that the container exists and shut
down.
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Emil Velikov [Mon, 25 Feb 2019 11:57:20 +0000 (11:57 +0000)]
docs: mention "Allow commits from members who can merge..."
Mention the tick-box otherwise only the MR author can rebase the series.
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reivewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Emil Velikov [Tue, 19 Feb 2019 15:30:41 +0000 (15:30 +0000)]
egl/android: bump the number of drmDevices to 64
It's the current maximum supported by the kernel. Stay consistent with
the rest of Mesa and use the same number.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Emil Velikov [Tue, 19 Feb 2019 15:30:39 +0000 (15:30 +0000)]
loader: use loader_open_device() to handle O_CLOEXEC
Some platforms lack O_CLOEXEC. The loader_open_device() handles those
appropriately, so use the helper.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Emil Velikov [Thu, 14 Feb 2019 11:23:58 +0000 (11:23 +0000)]
meson: egl: correctly manage loader/xmlconfig
Earlier commit introduced support for haiku yet did not properly
annotate the loader/xmlconfig dependencies.
Thus we ended up adding inc_loader for each !haiku platform - see
659910eda01 9a96bf0ecd0 c731508b988 ec6cb01e216.
One piece remained though - the wayland platform. Hence the following
would fail:
meson -Dgallium-drivers=etnaviv -Ddri-drivers=''\
-Dtools=etnaviv -Dplatforms=wayland -Dglx=disabled \
build/
Cc: Alexander von Gluck IV <kallisti5@unixzen.com>
Reported-by: Boris Brezillon <boris.brezillon@collabora.com>
Fixes: 834d221512f ("meson: Add Haiku platform support v4")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Emil Velikov [Tue, 5 Feb 2019 15:19:46 +0000 (15:19 +0000)]
egl/dri: de-duplicate dri2_load_driver*
The difference between the three functions is the list of mandatory
driver extensions. Pass that as an argument to the common helper.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Samuel Pitoiset [Mon, 25 Feb 2019 14:28:25 +0000 (15:28 +0100)]
radv: don't copy buffer descriptors list for samplers
Sampler descriptors don't have a buffer list.
This fixes some crashes with new CTS
dEQP-VK.binding_model.descriptor_copy.*.sampler_*.
Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Mon, 25 Feb 2019 14:28:24 +0000 (15:28 +0100)]
radv: fix out-of-bounds access when copying descriptors BO list
We shouldn't increment the buffer list pointers twice.
This fixes some crashes with new CTS
dEQP-VK.binding_model.descriptor_copy.*.
Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tapani Pälli [Mon, 25 Feb 2019 11:34:09 +0000 (13:34 +0200)]
nir: use nir_variable_create instead of open-coding the logic
Fixes: 3d7611e9 "st/nir: use NIR for asm programs"
Reported-by: Matthias Lorenz <oschowa@web.de>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tapani Pälli [Mon, 25 Feb 2019 09:14:11 +0000 (11:14 +0200)]
nir: initialize value in copy_prop_vars_block
Fixes following valgrind warning:
==27561== Conditional jump or move depends on uninitialised value(s)
==27561== at 0x667856B: value_set_ssa_components (nir_opt_copy_prop_vars.c:78)
==27561== by 0x667A1C4: copy_prop_vars_block (nir_opt_copy_prop_vars.c:797)
Fixes: 62332d139c8 "nir: Add a local variable-based copy propagation pass"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Eric Anholt [Mon, 25 Feb 2019 23:36:26 +0000 (15:36 -0800)]
v3d: Rematerialize MOVs of uniforms instead of spilling them.
If we have a MOV of a uniform value available to spill, that's one of our
best choices. We can just not spill the value, and emit a new load of the
uniform as the fill. This saves bothering the TMU and the thrsw, and is
the same cost in uniforms (since the spill offset is a uniform anyway).
This doesn't have a huge impact on shader-db, since there aren't a whole
lot of spills and we usually copy-prop the uniforms at the VIR level such
that the only uniform MOVs are from vir_lower_uniforms:
total instructions in shared programs:
6430292 ->
6430279 (<.01%)
total uniforms in shared programs:
2386023 ->
2385787 (<.01%)
total spills in shared programs: 4961 -> 4960 (-0.02%)
total fills in shared programs: 6352 -> 6350 (-0.03%)
However, I'm interested in dropping the uniforms copy-prop in the backend,
since it would be cheaper to not load repeated uniforms if we have the
registers to spare. This also saves many spills on
dEQP-GLES31.functional.ubo.random.all_per_block_buffers.20, which is what
motivated a bunch of my recent backend work in the first place:
before: 46 spills, 106 fills, 3062 instructions
after: 0 spills, 0 fills, 2611 instructions
Eric Anholt [Tue, 26 Feb 2019 00:27:41 +0000 (16:27 -0800)]
v3d: Dump the VIR after register spilling if we were forced to.
Spilling is unusual, but one often has to debug it when it happens, so
dump it.
Eric Anholt [Tue, 26 Feb 2019 02:01:08 +0000 (18:01 -0800)]
v3d: Fix vir_is_raw_mov() for input unpacks.
There are no users at the moment, but I wanted to start using this in
register spilling.
Mathias Fröhlich [Sat, 22 Dec 2018 15:49:16 +0000 (16:49 +0100)]
st/mesa: Reduce array updates due to current changes.
Since using bitmasks we can easily check if we have any
current value that is potentially uploaded on array setup.
So check for any potential vertex program input that is not
already a vao enabled array. Only flag array update if there is
a potential overlap.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Dylan Baker [Thu, 20 Dec 2018 17:54:17 +0000 (09:54 -0800)]
meson/iris: Use current coding style
Just a few minor style things.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Timothy Arceri [Sun, 24 Feb 2019 23:55:57 +0000 (10:55 +1100)]
radeonsi: fix query buffer allocation
Fix the logic for buffer full check on alloc.
This patch just takes the fix Nicolai attached to the bug report
and updates it to work on master.
Fixes: e0f0d3675d4 ("radeonsi: factor si_query_buffer logic out of si_query_hw")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109561
Eric Anholt [Mon, 25 Feb 2019 19:22:10 +0000 (11:22 -0800)]
nir: Just return when asked to rewrite uses of an SSA def to itself.
The nir_builder swizzling improvement to not emit extra MOVs resulted in
nir_lower_tex() trying to rewrite an SSA def to itself, triggering the
assert on all texturing in v3d. There's no work to be done in this case,
so just stop asserting.
Fixes: 743700be1f58 ("nir/builder: Don't emit no-op swizzles")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Samuel Pitoiset [Mon, 25 Feb 2019 11:14:59 +0000 (12:14 +0100)]
radv: fix clearing attachments in secondary command buffers
If no framebuffer is bound, get the number of samples and the
image format from the render pass.
This fixes new CTS dEQP-VK.geometry.layered.*.secondary_cmd_buffer.
Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Alok Hota [Tue, 19 Feb 2019 20:29:35 +0000 (14:29 -0600)]
swr/rast: Fix autotools and scons codegen
Use new input flags for gen_archrast.py
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Alok Hota [Mon, 17 Sep 2018 19:50:47 +0000 (14:50 -0500)]
swr/rast: Add general SWTag statistics
Update Archrast parser to use stats, used with an internal tool
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Alok Hota [Fri, 7 Sep 2018 20:17:53 +0000 (15:17 -0500)]
swr/rast: Add string handling to AR event framework
For use by an internal tool
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Alok Hota [Tue, 4 Sep 2018 18:41:39 +0000 (13:41 -0500)]
swr/rast: Add initial SWTag proto definitions
Update gen_archrast.py to properly generate event IDs
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Alok Hota [Fri, 31 Aug 2018 17:13:56 +0000 (12:13 -0500)]
swr/rast: Cleanup and generalize gen_archrast
Update meson.build to accomodate
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Daniel Schürmann [Fri, 22 Feb 2019 21:05:07 +0000 (22:05 +0100)]
nir: Use SM5 properties to optimize shift(a@32, iand(31, b))
This is a common pattern from HLSL->SPIRV translation
and supported in HW by all current NIR backends.
vkpipeline-db results anv (SKL):
total instructions in shared programs:
6403130 ->
6402380 (-0.01%)
instructions in affected programs: 204084 -> 203334 (-0.37%)
helped: 208
HURT: 0
total cycles in shared programs:
1915629582 ->
1918198408 (0.13%)
cycles in affected programs:
1158892682 ->
1161461508 (0.22%)
helped: 107
HURT: 86
shader-db results on i965 (KBL):
total instructions in shared programs:
15284592 ->
15284568 (<.01%)
instructions in affected programs: 81683 -> 81659 (-0.03%)
helped: 24
HURT: 0
total cycles in shared programs:
375013622 ->
375013932 (<.01%)
cycles in affected programs:
40169618 ->
40169928 (<.01%)
helped: 13
HURT: 9
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Daniel Schürmann [Thu, 14 Feb 2019 07:19:09 +0000 (08:19 +0100)]
nir: Define shifts according to SM5 specification.
SPIR-V shifts are undefined for values >= bitsize, but SM5 shifts
are defined to only use the least significant bits.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Jason Ekstrand [Thu, 7 Feb 2019 23:45:51 +0000 (17:45 -0600)]
intel/eu: Add an EOT parameter to send_indirect_[split]_message
For split indirect sends we have to put the EOT parameter in the
extended descriptor as well as the instruction itself so just calling
brw_inst_set_eot is insufficient. Moving the EOT handling handling into
the send_indirect_[split]_message helper lets us handle it properly.
Sergii Romantsov [Fri, 22 Feb 2019 09:23:08 +0000 (11:23 +0200)]
d3d: meson: do not prefix user provided d3d-drivers-path
The user can select the location where there d3d drivers
are installed by the d3d-drivers-path meson option.
By default path will be $prefix/$libdir/d3d.
Currently we add $prefix to the user provided path.
Resulting in an incorrect or even missing path.
Based on logic of
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109698
CC: Kenneth Graunke <kenneth@whitecape.org>
CC: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Sergii Romantsov [Thu, 21 Feb 2019 08:28:11 +0000 (10:28 +0200)]
dri: meson: do not prefix user provided dri-drivers-path
The user can select the location where there dri drivers
are installed by the dri-drivers-path meson option.
By default path will be $prefix/$libdir/dri.
Currently we add $prefix to the user provided path.
Resulting in an incorrect or even missing path.
v2: fixed dri_search_path by default, rebased to master
v3: new commit-message (Emil Velikov), cc mesa-stable
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109698
CC: Rafael Antognolli <rafael.antognolli@intel.com>
CC: Dylan Baker <dylan@pnwbakers.com>
Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Fixes: 306914db92e1 (meson: Add dridriverdir variable to dri.pc.)
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Lionel Landwerlin [Mon, 25 Feb 2019 10:50:59 +0000 (10:50 +0000)]
intel/aub_viewer: silence more compiler warnings
format not a string literal and no format arguments.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Lionel Landwerlin [Mon, 25 Feb 2019 10:48:52 +0000 (10:48 +0000)]
intel/aub_viewer: silence compiler warning
buffer_addr may be used uninitialized.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Lionel Landwerlin [Mon, 25 Feb 2019 10:47:55 +0000 (10:47 +0000)]
intel/aub_viewer: printout 48bits addresses
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Gert Wollny [Wed, 2 Jan 2019 14:44:33 +0000 (15:44 +0100)]
mesa/core: Enable EXT_depth_clamp for GLES >= 2.0
The extension NV_depth_clamp is written against OpenGL 1.2.1, and
since GLES 2.0 is based on GL 2.0 there is no reason not to enable
this extension also for GLES >= 2.0.
v2: Use EXT_depth_clamp that has been proposed to Khronos
v3: - Fix check for extension availability (Erik Faya-Lund)
- Also fix the test in is_enabled
v4: - Test both, ARB and EXT extension (Erik)
v5: - Fix white space errors (Erik)
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Kenneth Graunke [Fri, 22 Feb 2019 07:37:58 +0000 (23:37 -0800)]
iris: Properly allow rendering to RGBX formats.
I was converting them at pipe_surface creation time, but not when
answering queries about whether formats support rendering. This caused
a lot of FBO incomplete errors for formats that ought to be supported.
Fixes "Child of Light", which uses PIPE_FORMAT_R8G8B8X8_UNORM_SRGB.
Also fixes Witcher 1 using wined3d (GL) according to Timur Kristóf.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109738
Kenneth Graunke [Fri, 22 Feb 2019 05:36:05 +0000 (21:36 -0800)]
iris: Drop RGBX -> RGBA for storage image usages
GLSL doesn't expose RGB/RGBX image formats, so this isn't needed.
Kenneth Graunke [Fri, 22 Feb 2019 09:16:41 +0000 (01:16 -0800)]
mesa: Fix RGBBuffers for renderbuffers with sized internal formats
For texture attachments, 'f' is texImg->_BaseFormat, but for
renderbuffer attachments, 'f' is att->Renderbuffer->InternalFormat.
InternalFormat may be something like GL_RGB8, which causes our
(f == GL_RGB) check to fail. Switch to using a proper _BaseFormat,
which drops the size.
Fixes dEQP-GLES31.functional.draw_buffers_indexed.random.
max_required_draw_buffers.15 on iris when combined with a driver fix.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Oscar Blumberg [Mon, 11 Feb 2019 16:46:20 +0000 (17:46 +0100)]
glsl: Fix function return typechecking
apply_implicit_conversion only converts and check base types but we
need actual type equality for function returns, otherwise you can
return a vec2 from a function declared as returning a float.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Jordan Justen [Sun, 24 Feb 2019 22:21:39 +0000 (14:21 -0800)]
iris: Always use in-tree i915_drm.h
Ref:
f1374805a86 "drm-uapi: use local files, not system libdrm"
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Alyssa Rosenzweig [Sun, 24 Feb 2019 06:22:23 +0000 (06:22 +0000)]
panfrost: Decode render target swizzle/channels
On MRT-capable systems, the framebuffer format is encoded as a 64-bit
word in the render target descriptor. Previously, the two 32-bit
words were exposed as opaque hex values. This commit identifies a 12-bit
Mali swizzle and a 2-bit channel counter, removing some of the magic. It
also adds decoding support for the AFBC and MSAA enable bits, which were
already known but otherwise ignored in pandecode.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Alyssa Rosenzweig [Sat, 23 Feb 2019 01:12:10 +0000 (01:12 +0000)]
panfrost/midgard: Add fround(_even), ftrunc, ffma
These ops were discovered by invoking the correspondingly names GLSL
functions. The rounding ops here behave exact as expected and are mapped
to their corresponding NIR ops where applicable. The ffma behaves as a
LUT instruction and requires some special argument packing (since
Midgard normally only allows for 2 arguments); this quirk will be
addressed in the future, but for now FMA is still lowered.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Alyssa Rosenzweig [Mon, 18 Feb 2019 23:32:05 +0000 (23:32 +0000)]
panfrost/nondrm: Split out dump_counters
Previously, this function was implied a part of the job submit.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Alyssa Rosenzweig [Mon, 25 Feb 2019 02:32:45 +0000 (02:32 +0000)]
panfrost/nondrm: Make COHERENT_LOCAL explicit
This flag corresponds to what was MEM_COHERENT_LOCAL in the vendor
driver, which seems to influence the cache policy, necessary for the
varying temporary storage but nothing else.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Alyssa Rosenzweig [Mon, 25 Feb 2019 02:31:09 +0000 (02:31 +0000)]
panfrost/nondrm: Flag CPU-invisible regions
Potentially, the kernel could optimize these allocations, or perhaps we
can save on mapping costs.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Alyssa Rosenzweig [Fri, 22 Feb 2019 23:08:59 +0000 (23:08 +0000)]
panfrost/meson: Remove subdir for nondrm
This change fixes cross builds with the (temporary) non-DRM overlay.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Alyssa Rosenzweig [Thu, 21 Feb 2019 05:57:29 +0000 (05:57 +0000)]
panfrost: Use tiler fast path (performance boost)
For reasons that are still unclear (speculation included in the comment
added in this patch), the tiler? metadata has a fast path that we were
not enabling; there looks to be a possible time/memory tradeoff, but the
details remain unclear.
Regardless, this patch improves performance dramatically. Particular
wins are for geometry-heavy scenes. For instance, glmark2-es2's
Phong-shaded bunny, rendering at fullscreen (2400x1600) via GBM, jumped
from ~20fps to hitting vsync cap at 60fps. Gains are even more obvious
when vsync is disabled, as in glmark2-es2-wayland.
With this patch, on GLES 2.0 samples not involving FBOs, it appears
performance is converging with (and sometimes surpassing) the blob.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Jason Ekstrand [Fri, 22 Feb 2019 23:06:39 +0000 (17:06 -0600)]
nir/builder: Don't emit no-op swizzles
The nir_swizzle helper is used some on it's own but it's also called by
nir_channel and nir_channels which are used everywhere. It's pretty
quick to check while we're walking the swizzle anyway whether or not
it's an identity swizzle. If it is, we now don't bother emitting the
instruction. Sure, copy-prop will clean it up for us but there's no
sense making more work for the optimizer than we have to.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Jason Ekstrand [Sat, 23 Feb 2019 04:10:55 +0000 (22:10 -0600)]
nir/split_vars: Don't compact vectors unnecessarily
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Erik Faye-Lund [Wed, 20 Feb 2019 20:50:50 +0000 (21:50 +0100)]
st/mesa: remove unused header-file
This header has been unused since
f8f2520e88c ("st/mesa: Remove
unnecessary headers"). And in the more than 8 years since, this
hasn't been useful. So let's just get rid of it.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Maya Rashish [Thu, 10 Jan 2019 14:18:48 +0000 (16:18 +0200)]
configure: fix test portability
From the bash manual:
string1 == string2
string1 = string2
True if the strings are equal. = should be used with the test
command for POSIX conformance.
David Shao [Sun, 24 Feb 2019 09:00:36 +0000 (09:00 +0000)]
meson: ensure that xmlpool_options.h is generated for gallium targets that need it
Fixes: 68076b87474e7959c161 "meson: build gallium vdpau state tracker"
Fixes: 22a817af8a89eb3c762f "meson: build gallium xvmc state tracker"
Fixes: 5a785d51a6d68ec676ce "meson: build gallium va state tracker"
Fixes: 0ba909f0f111824223bc "meson: build gallium xa state tracker"
Fixes: 1d36dc674d528b93bec3 "meson: build gallium omx state tracker"
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Matthias Lorenz [Fri, 22 Feb 2019 23:08:28 +0000 (00:08 +0100)]
vulkan/overlay: Add fps counter
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109747
Lionel Landwerlin [Sun, 24 Feb 2019 01:06:39 +0000 (01:06 +0000)]
Revert "anv: add support for INTEL_DEBUG=bat"
This reverts commit
e4d88396d259c4ec6032d2834d1c9073d55e9b45.
Apologies, I pushed the wrong commit.
Lionel Landwerlin [Sat, 23 Feb 2019 23:27:17 +0000 (23:27 +0000)]
anv: add support for INTEL_DEBUG=bat
As requested by Ken ;)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Christian Gmeiner [Fri, 22 Feb 2019 10:10:29 +0000 (11:10 +0100)]
etnaviv: blt: mark used src resource as read from
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Christian Gmeiner [Fri, 22 Feb 2019 10:02:34 +0000 (11:02 +0100)]
etnaviv: rs: mark used src resource as read from
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Vinson Lee [Tue, 19 Feb 2019 03:27:27 +0000 (19:27 -0800)]
gallium/auxiliary/vl: Fix duplicate symbol build errors.
CXXLD gallium_dri.la
duplicate symbol _compute_shader_video_buffer in:
../../../../src/gallium/auxiliary/.libs/libgalliumvl.a(libgalliumvl_la-vl_compositor.o)
../../../../src/gallium/auxiliary/.libs/libgalliumvl.a(libgalliumvl_la-vl_compositor_cs.o)
duplicate symbol _compute_shader_weave in:
../../../../src/gallium/auxiliary/.libs/libgalliumvl.a(libgalliumvl_la-vl_compositor.o)
../../../../src/gallium/auxiliary/.libs/libgalliumvl.a(libgalliumvl_la-vl_compositor_cs.o)
duplicate symbol _compute_shader_rgba in:
../../../../src/gallium/auxiliary/.libs/libgalliumvl.a(libgalliumvl_la-vl_compositor.o)
../../../../src/gallium/auxiliary/.libs/libgalliumvl.a(libgalliumvl_la-vl_compositor_cs.o)
Fixes: 9364d66cb7f7 ("gallium/auxiliary/vl: Add video compositor compute shader render")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Caio Marcelo de Oliveira Filho [Sat, 23 Feb 2019 06:38:05 +0000 (22:38 -0800)]
nir: fix MSVC build
Zero initialize struct with {0} instead of {}.
Caio Marcelo de Oliveira Filho [Mon, 14 Jan 2019 21:52:36 +0000 (13:52 -0800)]
nir/copy_prop_vars: add tests for load/store elements of vectors
Test using array deref on vectors in loads and stores. These are
marked DISABLED_ as this optimization is currently not done.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Caio Marcelo de Oliveira Filho [Tue, 15 Jan 2019 00:10:44 +0000 (16:10 -0800)]
nir: nir_build_deref_follower accept array derefs of vectors
Code itself already supports it, just make sure we can use it for
those cases.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Caio Marcelo de Oliveira Filho [Mon, 14 Jan 2019 21:33:00 +0000 (13:33 -0800)]
nir/copy_prop_vars: change test helper to get intrinsics
Replace find_next_intrinsic(intrinsic, after) with
get_intrinsic(intrinsic, index). This makes slightly more convenient
to check the resulting loads/stores/copies, since in most tests we
know which one we care about. The cost is to perform more traversals,
but for such tests this is not a problem.
Added the ASSERT_EQ() on count to some tests missing it, so the
indices queried are always expected to find something.
Also, drop two nir_print_shader leftover calls in a test.
v2: Remove redundant assertions. nir_src_comp_as_uint already
assert what we need. (Jason)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Caio Marcelo de Oliveira Filho [Mon, 14 Jan 2019 20:26:30 +0000 (12:26 -0800)]
nir/copy_prop_vars: keep track of components in copy_entry
When a copy_entry is SSA, store not only the nir_ssa_def* for each
component, but also the source component they come from. At the
moment this is always a match (i.e. 'component[i] == i'), because all
the operations for a copy_entry happen using definitions with the same
size. This prepares the code for array_derefs of vectors, in which
'component[i] != i'.
Also, extract setting all SSA components into a function of its own.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Caio Marcelo de Oliveira Filho [Sat, 12 Jan 2019 00:27:24 +0000 (16:27 -0800)]
nir/copy_prop_vars: add debug helpers
Disabled by default, to be used during development. Adding those
so I don't rewrite some ad-hoc version of them everytime I'm working
with this pass.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Caio Marcelo de Oliveira Filho [Tue, 8 Jan 2019 23:53:02 +0000 (15:53 -0800)]
nir/copy_prop_vars: don't get confused by array_deref of vectors
For now these derefs are not handled, so don't let these get into the
copies list -- which would cause wrong propagations. For load_derefs,
do nothing. For store_derefs, invalidate whatever the store is
writing to. For copy_derefs, invalidate whatever the copy is writing
to.
These cases will happen once derefs to SSBOs/UBOs are kept around long
enough to get optimized by copy_prop_vars.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Timothy Arceri [Fri, 22 Feb 2019 05:59:13 +0000 (16:59 +1100)]
nir: allow nir_lower_phis_to_scalar() on more src types
Rather than only lowering if all srcs are scalarizable we instead
check that at least one src is scalarizable.
We change undef type to return false otherwise it will cause
regressions when it is the only scalarizable src.
total instructions in shared programs:
13219105 ->
13024547 (-1.47%)
instructions in affected programs:
1153797 -> 959239 (-16.86%)
helped: 581
HURT: 74
total cycles in shared programs:
333968972 ->
324807922 (-2.74%)
cycles in affected programs:
129809402 ->
120648352 (-7.06%)
helped: 571
HURT: 131
total spills in shared programs: 57947 -> 29130 (-49.73%)
spills in affected programs: 53364 -> 24547 (-54.00%)
helped: 351
HURT: 0
total fills in shared programs: 51310 -> 25468 (-50.36%)
fills in affected programs: 44882 -> 19040 (-57.58%)
helped: 351
HURT: 0
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Alok Hota [Thu, 21 Feb 2019 20:41:15 +0000 (14:41 -0600)]
swr/rast: bypass size limit for non-sampled textures
This fixes a bug where SWR will fail to render in cases with large
buffer allocations, e.g. very large meshes whose vertex buffers exceed
2GB
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Marek Olšák [Wed, 20 Feb 2019 22:21:32 +0000 (17:21 -0500)]
tgsi: don't set tgsi_info::uses_bindless_images for constbufs and hw atomics
This might have decreased performance for radeonsi/tgsi, because most
most shaders claimed they used bindless.
Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Jordan Justen [Sun, 17 Feb 2019 01:39:45 +0000 (17:39 -0800)]
iris: Add gitlab-ci build testing
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Rob Clark [Fri, 22 Feb 2019 18:09:25 +0000 (13:09 -0500)]
freedreno/a6xx: cube image fix
Note that emit_intrinsic_load_image() already swaps a .3d flag with an
.a flag. I tried doing things the other way around (going back to .3d)
but that didn't work. And treating cube images as 2d array is also what
blob does, so let's just go with that.
Fixes dEQP-GLES31.functional.image_load_store.cube.load_store.*
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Thu, 21 Feb 2019 20:44:35 +0000 (15:44 -0500)]
freedreno/a6xx: fix border-color offset
Fixes nearly all of dEQP-GLES31.functional.texture.border_clamp.* when
run after a test that binds textures used in vertex shader.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Thu, 21 Feb 2019 19:46:10 +0000 (14:46 -0500)]
freedreno/ir3: don't hardcode wrmask
Fixes dEQP-GLES31.functional.shaders.opaque_type_indexing.sampler.const_literal.vertex.samplercubeshadow
and few other similar tests that do multiple texture fetches into
individual components of a packet output. Mostly works around the
issue mentioned in ra_block_find_definers().
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Tue, 19 Feb 2019 14:29:49 +0000 (09:29 -0500)]
freedreno: fix race condition
rsc->write_batch can be cleared behind our back, so we need to acquire
the lock *before* deref'ing.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Kenneth Graunke [Fri, 22 Feb 2019 03:07:29 +0000 (19:07 -0800)]
vulkan: Fix 32-bit build for the new overlay layer
vulkan_core.h defines non-dispatchable handles as (struct object *)
on 64-bit systems, but uint64_t on 32-bit systems. The former can be
implicitly cast to void *, but the latter requires an explicit cast.
While here, %lu is the wrong format specifier for uint64_t on 32-bit
systems, so use PRIu64, fixing a warning.
Reported-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Juan A. Suarez Romero [Fri, 22 Feb 2019 15:47:53 +0000 (16:47 +0100)]
anv: advertise 8 subpixel precision bits
On one side, when emitting 3DSTATE_SF, VertexSubPixelPrecisionSelect is
used to select between 8 bit subpixel precision (value 0) or 4 bit
subpixel precision (value 1). As this value is not set, means it is
taking the value 0, so 8 bit are used.
On the other side, in the Vulkan CTS tests, if the reference rasterizer,
which uses 8 bit precision, as it is used to check what should be the
expected value for the tests, is changed to use 4 bit as ANV was
advertising so far, some of the tests will fail.
So it seems ANV is actually using 8 bits.
v2: explicitly set 3DSTATE_SF::VertexSubPixelPrecisionSelect (Jason)
v3: use _8Bit definition as value (Jason)
v4: (by Jason)
anv: Explicitly set 3DSTATE_CLIP::VertexSubPixelPrecisionSelect
This field was added on gen8 even though there's an identically defined
one in 3DSTATE_SF.
CC: Jason Ekstrand <jason@jlekstrand.net>
CC: Kenneth Graunke <kenneth@whitecape.org>
CC: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Juan A. Suarez Romero [Fri, 22 Feb 2019 15:16:24 +0000 (16:16 +0100)]
genxml: add missing field values for 3DSTATE_SF
Fill out "Vertex Sub Pixel Precision Select" possible values.
CC: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Bas Nieuwenhuizen [Fri, 22 Feb 2019 13:24:28 +0000 (14:24 +0100)]
radv: Allow interpolation on non-float types.
In particular structs containing floats and 16-bit floating point
types.
Fixes: 62024fa7750 "radv: enable VK_KHR_16bit_storage extension / 16bit storage features"
Fixes: da295946361 "spirv: Only split blocks"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109735
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Bas Nieuwenhuizen [Fri, 22 Feb 2019 13:16:08 +0000 (14:16 +0100)]
radv: Fix float16 interpolation set up.
float16 types can have non-flat interpolation so set up the HW
correctly for that.
Fixes: 62024fa7750 "radv: enable VK_KHR_16bit_storage extension / 16bit storage features"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Ilia Mirkin [Fri, 22 Feb 2019 14:40:37 +0000 (09:40 -0500)]
nv50: disable compute
It causes more trouble than it's worth. Now vl tries to create compute
shaders without all the proper checking. Since there's really no
(current) way to use compute on nv50, just mark it disabled.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109742
Fixes: f6ac0b5d71 ("gallium/auxiliary/vl: Add compute shader to support video compositor render")
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Lionel Landwerlin [Wed, 20 Feb 2019 12:49:17 +0000 (12:49 +0000)]
intel: fix urb size for CFL GT1
Same 192Kb amount as SKL/KBL GT1 applies.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Fixes: de7ed0ba5522 ("i965/CFL: Add PCI Ids for Coffee Lake.")
Samuel Iglesias Gonsálvez [Tue, 19 Feb 2019 12:06:25 +0000 (13:06 +0100)]
isl: the display engine requires 64B alignment for linear surfaces
v2: Add PRM quote (Lionel)
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Gert Wollny [Thu, 14 Feb 2019 14:21:30 +0000 (15:21 +0100)]
virgl: Enable mixed color FBO attachemnets only when the host supports
it
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
Mauro Rossi [Thu, 21 Feb 2019 23:30:38 +0000 (00:30 +0100)]
android: intel/isl: remove redundant building rules
Fixes the following building error:
including ./external/mesa/Android.mk ...
build/core/base_rules.mk:183: *** external/mesa/src/intel:
MODULE.TARGET.STATIC_LIBRARIES.libmesa_isl_tiled_memcpy already defined by external/mesa/src/intel.
make: *** [build/core/ninja.mk:164: out/build-android_x86_64.ninja] Error 1
ISL_TILED_MEMCPY_FILES is isl/isl_tiled_memcpy_normal.c
and that source file includes isl_tiled_memcpy.c source
Fixes: 96bb328 ("iris: add Android build")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Kenneth Graunke [Thu, 21 Feb 2019 23:50:14 +0000 (15:50 -0800)]
Revert "iris: Enable auxiliary buffer support"
This reverts commit
cd0ced49e7957182d23e21657445b720184ea425.
It breaks glxgears rendering.
Kenneth Graunke [Thu, 21 Feb 2019 22:29:00 +0000 (14:29 -0800)]
iris: Enable -msse2 and -mstackrealign
This is needed for gen_clflush.h intrinsics to work on 32-bit builds.
i965 and anv both set these, and iris needs to as well.
Tested-by: Mark Janes <mark.a.janes@intel.com>
Francisco Jerez [Fri, 18 Jan 2019 19:38:17 +0000 (11:38 -0800)]
intel/fs: Rely on undocumented unrestricted regioning for 32x16-bit integer multiply.
Even though the hardware spec claims that any "integer DWord multiply"
operation is affected by the regioning restrictions of CHV/BXT/GLK,
this is inconsistent with the behavior of the simulator and with
empirical evidence -- Return false from has_dst_aligned_region_restriction()
for such instructions as a micro-optimization.
Tested-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Francisco Jerez [Fri, 18 Jan 2019 20:51:57 +0000 (12:51 -0800)]
intel/fs: Implement extended strides greater than 4 for IR source regions.
Strides up to 32B can be implemented for the source regions of most
instructions by leveraging either the vertical or the horizontal
stride of the hardware Align1 region. The main motivation for this is
that currently the lower_integer_multiplication() pass will happily
double the stride of one of the 32-bit sources, which can blow up if
the stride of the original source was already the maximum value
allowed by the hardware.
An alternative would be to use the regioning legalization pass in
order to lower such strides into the composition of multiple legal
strides, but that would be somewhat less efficient.
This showed up as a regression from my commit
cbea91eb57a501bebb1ca2
in Vulkan 1.1 CTS tests on CHV/BXT platforms, however it was really a
pre-existing problem that had affected conformance on other platforms
without native support for integer multiplication. CHV/BXT were
getting around it because the code I removed in that commit had the
"fortunate" side effect of emitting narrower regions that didn't hit
the hardware stride limit after lowering. Beyond fixing the
regression this fixes ~90 additional Vulkan 1.1 subgroup CTS tests on
ICL (that's why this patch is marked for inclusion in mesa-stable even
though the original regressing patch was not).
According to Jason, a nearly equivalent change had been committed
previously as
e8c9e65185de3e821e1 and then (mistakenly?) reverted as
a31d0382084c8aa8.
Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109328
Reported-by: Mark Janes <mark.a.janes@intel.com>
Tested-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>