mesa.git
6 years agonir: Narrow some dot product operations
Ian Romanick [Thu, 15 Feb 2018 22:49:55 +0000 (14:49 -0800)]
nir: Narrow some dot product operations

On vector platforms, this helps elide some constant loads.

v2: Reorder the transformations.
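
The message does not list the individual rewrites. As a rough illustration of what narrowing a dot product means, and assuming the pass matches operands whose trailing components are constant zero, the sketch below drops those components from both operands. It is plain Python with hypothetical helper names, not the NIR algebraic pattern syntax:

```python
def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    assert len(a) == len(b)
    return sum(x * y for x, y in zip(a, b))


def narrow_dot(a, b):
    """Drop trailing components where `a` is exactly zero.

    Hypothetical helper: returns the shortened operands. A zero component
    contributes nothing to the sum for finite inputs, so the narrower dot
    product has the same value and the zero constants need not be loaded.
    (With NaN or infinity in `b` the two results can differ.)
    """
    n = len(a)
    while n > 1 and a[n - 1] == 0.0:
        n -= 1
    return a[:n], b[:n]


a = [2.0, 3.0, 0.0, 0.0]
b = [5.0, 7.0, 11.0, 13.0]
na, nb = narrow_dot(a, b)
# A 4-component dot product has become a 2-component one.
assert len(na) == 2
assert dot(na, nb) == dot(a, b)
```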

No changes on Broadwell or Skylake.

Haswell
total instructions in shared programs: 13093793 -> 13060163 (-0.26%)
instructions in affected programs: 1277532 -> 1243902 (-2.63%)
helped: 13216
HURT: 95
helped stats (abs) min: 1 max: 18 x̄: 2.56 x̃: 2
helped stats (rel) min: 0.21% max: 20.00% x̄: 3.63% x̃: 2.78%
HURT stats (abs)   min: 1 max: 6 x̄: 1.77 x̃: 1
HURT stats (rel)   min: 0.09% max: 5.56% x̄: 1.25% x̃: 1.19%
95% mean confidence interval for instructions value: -2.57 -2.49
95% mean confidence interval for instructions %-change: -3.65% -3.54%
Instructions are helped.

total cycles in shared programs: 409580819 -> 409268463 (-0.08%)
cycles in affected programs: 71730652 -> 71418296 (-0.44%)
helped: 9898
HURT: 2352
helped stats (abs) min: 2 max: 16014 x̄: 37.08 x̃: 16
helped stats (rel) min: <.01% max: 35.55% x̄: 6.26% x̃: 4.50%
HURT stats (abs)   min: 2 max: 276 x̄: 23.25 x̃: 6
HURT stats (rel)   min: <.01% max: 40.00% x̄: 3.54% x̃: 1.97%
95% mean confidence interval for cycles value: -33.19 -17.80
95% mean confidence interval for cycles %-change: -4.50% -4.26%
Cycles are helped.

total fills in shared programs: 82059 -> 82052 (<.01%)
fills in affected programs: 21 -> 14 (-33.33%)
helped: 7
HURT: 0

Sandy Bridge and Ivy Bridge had similar results (Ivy Bridge shown)
total instructions in shared programs: 11811851 -> 11780605 (-0.26%)
instructions in affected programs: 1155007 -> 1123761 (-2.71%)
helped: 12304
HURT: 95
helped stats (abs) min: 1 max: 18 x̄: 2.55 x̃: 2
helped stats (rel) min: 0.21% max: 20.00% x̄: 3.69% x̃: 2.86%
HURT stats (abs)   min: 1 max: 6 x̄: 1.77 x̃: 1
HURT stats (rel)   min: 0.09% max: 5.56% x̄: 1.25% x̃: 1.19%
95% mean confidence interval for instructions value: -2.56 -2.48
95% mean confidence interval for instructions %-change: -3.71% -3.59%
Instructions are helped.

total cycles in shared programs: 257618409 -> 257316805 (-0.12%)
cycles in affected programs: 71999580 -> 71697976 (-0.42%)
helped: 9155
HURT: 2380
helped stats (abs) min: 2 max: 16014 x̄: 38.44 x̃: 16
helped stats (rel) min: <.01% max: 35.75% x̄: 6.39% x̃: 4.62%
HURT stats (abs)   min: 2 max: 290 x̄: 21.14 x̃: 4
HURT stats (rel)   min: <.01% max: 41.55% x̄: 3.14% x̃: 1.33%
95% mean confidence interval for cycles value: -34.32 -17.97
95% mean confidence interval for cycles %-change: -4.55% -4.29%
Cycles are helped.

GM45 and Iron Lake had nearly identical results (Iron Lake shown)
total instructions in shared programs: 7886750 -> 7879944 (-0.09%)
instructions in affected programs: 373781 -> 366975 (-1.82%)
helped: 3715
HURT: 47
helped stats (abs) min: 1 max: 8 x̄: 1.86 x̃: 1
helped stats (rel) min: 0.22% max: 16.67% x̄: 2.88% x̃: 2.06%
HURT stats (abs)   min: 1 max: 6 x̄: 2.55 x̃: 2
HURT stats (rel)   min: 1.09% max: 5.00% x̄: 1.93% x̃: 2.35%
95% mean confidence interval for instructions value: -1.85 -1.77
95% mean confidence interval for instructions %-change: -2.91% -2.73%
Instructions are helped.

total cycles in shared programs: 178114636 -> 178095452 (-0.01%)
cycles in affected programs: 7227666 -> 7208482 (-0.27%)
helped: 3349
HURT: 301
helped stats (abs) min: 2 max: 90 x̄: 6.55 x̃: 4
helped stats (rel) min: <.01% max: 14.18% x̄: 0.95% x̃: 0.63%
HURT stats (abs)   min: 2 max: 42 x̄: 9.13 x̃: 10
HURT stats (rel)   min: 0.01% max: 11.19% x̄: 1.22% x̃: 1.50%
95% mean confidence interval for cycles value: -5.52 -4.99
95% mean confidence interval for cycles %-change: -0.81% -0.73%
Cycles are helped.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [v1]
6 years agoi965: perf: consolidate unmapping oa perf bo outside accumulation
Lionel Landwerlin [Wed, 7 Mar 2018 14:10:15 +0000 (14:10 +0000)]
i965: perf: consolidate unmapping oa perf bo outside accumulation

Do this in one place outside the only caller of the accumulation
function.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
6 years agoi965: perf: count number of accumulated reports
Lionel Landwerlin [Tue, 6 Mar 2018 17:11:56 +0000 (17:11 +0000)]
i965: perf: count number of accumulated reports

This will be reused later.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
6 years agoi965: perf: reuse timescale base function from query
Lionel Landwerlin [Tue, 6 Mar 2018 15:47:00 +0000 (15:47 +0000)]
i965: perf: reuse timescale base function from query

We already have the same function in brw_queryobj.c

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
6 years agoi965: perf: store sysfs device entry into context
Lionel Landwerlin [Wed, 7 Feb 2018 18:09:58 +0000 (18:09 +0000)]
i965: perf: store sysfs device entry into context

We want to reuse it later on.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
6 years agoi965: perf: store the hw_id of the context in the query
Lionel Landwerlin [Wed, 7 Feb 2018 18:10:57 +0000 (18:10 +0000)]
i965: perf: store the hw_id of the context in the query

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
6 years agoi965: perf: default case for unknown query types
Lionel Landwerlin [Tue, 6 Feb 2018 17:29:32 +0000 (17:29 +0000)]
i965: perf: default case for unknown query types

Just some extra safety before further changes.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
6 years agoradeonsi: remove chip_class parameter from si_lower_nir
Marek Olšák [Tue, 6 Mar 2018 23:30:06 +0000 (18:30 -0500)]
radeonsi: remove chip_class parameter from si_lower_nir

We can get it from si_screen.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
6 years agowinsys/amdgpu: query GDS info
Marek Olšák [Sun, 11 Sep 2016 19:53:20 +0000 (21:53 +0200)]
winsys/amdgpu: query GDS info

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
6 years agowinsys/amdgpu: pad compute IBs
Marek Olšák [Tue, 6 Mar 2018 20:03:09 +0000 (15:03 -0500)]
winsys/amdgpu: pad compute IBs

v2: pad with PKT2 NOPs on SI

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
6 years agoradeonsi: expand constbuf 0 address correctly to fix Vega10 hangs
Marek Olšák [Wed, 7 Mar 2018 16:36:26 +0000 (11:36 -0500)]
radeonsi: expand constbuf 0 address correctly to fix Vega10 hangs

This is only required with the latest libdrm.

This fixes 32-bit support with high addresses.
(and possibly 64-bit support too because the high bits need to be masked out)

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
6 years agoradeonsi: align command buffer starting address to fix some Raven hangs
Marek Olšák [Wed, 7 Mar 2018 00:07:58 +0000 (19:07 -0500)]
radeonsi: align command buffer starting address to fix some Raven hangs

Cc: 17.3 18.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
6 years agoetnaviv: add get_driver_query_group_info(..)
Christian Gmeiner [Mon, 5 Mar 2018 22:26:43 +0000 (23:26 +0100)]
etnaviv: add get_driver_query_group_info(..)

This enables AMD_performance_monitor extension.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
6 years agoetnaviv: add query_group_info for sw counters
Christian Gmeiner [Mon, 5 Mar 2018 22:26:42 +0000 (23:26 +0100)]
etnaviv: add query_group_info for sw counters

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
6 years agomeson: Fix building gallium media libs without egl
Dylan Baker [Wed, 28 Feb 2018 21:07:57 +0000 (13:07 -0800)]
meson: Fix building gallium media libs without egl

v2: - rebase on omx fix

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net> (v1)
6 years agomeson: Allow building dri based EGL without GLX
Dylan Baker [Wed, 28 Feb 2018 18:13:38 +0000 (10:13 -0800)]
meson: Allow building dri based EGL without GLX

It should be possible to build EGL without GLX, but the meson build
currently doesn't allow that because it too tightly couples glx and dri.
This patch eases dri and glx apart, so that EGL without GLX can be
built.

CC: Daniel Stone <daniels@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
6 years agoglx/apple: Ship meson build file in tarball
Thierry Reding [Tue, 6 Mar 2018 09:44:08 +0000 (10:44 +0100)]
glx/apple: Ship meson build file in tarball

The meson build file for Apple GLX is not listed in the EXTRA_DIST make
variable and therefore isn't shipped as part of the release tarball, so
meson builds from the tarball will fail.

Add the file to EXTRA_DIST to ensure it is included in the tarball.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
6 years agoac/nir: do not emit unnecessary null exports in fragment shaders
Samuel Pitoiset [Thu, 8 Mar 2018 08:53:14 +0000 (09:53 +0100)]
ac/nir: do not emit unnecessary null exports in fragment shaders

Null exports should only be needed when no other exports are
emitted. This removes a bunch of 'exp null off, off, off, off done vm'.

Affected games are Dota 2 and Wolfenstein 2, not sure if that
really helps, but code size is decreasing there.

Polaris10:
Totals from affected shaders:
SGPRS: 8216 -> 8216 (0.00 %)
VGPRS: 7072 -> 7072 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 454968 -> 453896 (-0.24 %) bytes
Max Waves: 772 -> 772 (0.00 %)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agodrirc: whitespace fix
Eric Engestrom [Thu, 8 Mar 2018 09:52:16 +0000 (09:52 +0000)]
drirc: whitespace fix

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
6 years agodrirc: Disable the GLX_SGI_video_sync extension for gnome-shell on vmware
Thomas Hellstrom [Mon, 26 Feb 2018 13:32:01 +0000 (14:32 +0100)]
drirc: Disable the GLX_SGI_video_sync extension for gnome-shell on vmware

With this extension enabled and a server GLX implementation that actually
honors it, window movement lags considerably on gnome-shell/vmware, so
disable it by default.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
6 years agogallium/st_dri: Honor the glx_disable_sgi_video_sync config option
Thomas Hellstrom [Mon, 26 Feb 2018 13:30:33 +0000 (14:30 +0100)]
gallium/st_dri: Honor the glx_disable_sgi_video_sync config option

This option is disabled by default. Primarily intended for drivers on
virtual hardware.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
6 years agoglx/dri: Add a driconf option to disable GLX_SGI_video_sync
Thomas Hellstrom [Mon, 26 Feb 2018 13:27:40 +0000 (14:27 +0100)]
glx/dri: Add a driconf option to disable GLX_SGI_video_sync

Drivers on virtual hardware don't want to expose this extension to
GLX compositors, similarly to GLX_OML_sync_control, since that significantly
increases latency.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
6 years agoac/radeonsi: add emit_kill to the abi
Timothy Arceri [Wed, 7 Mar 2018 22:46:42 +0000 (09:46 +1100)]
ac/radeonsi: add emit_kill to the abi

This should fix a regression with Rocket League grass rendering
on the NIR backend.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104717

6 years agoradeonsi: add si_llvm_emit_kill() helper
Timothy Arceri [Wed, 7 Mar 2018 22:37:10 +0000 (09:37 +1100)]
radeonsi: add si_llvm_emit_kill() helper

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agospirv: fix autotools builds
Timothy Arceri [Wed, 7 Mar 2018 23:37:52 +0000 (10:37 +1100)]
spirv: fix autotools builds

Fixes: 68a6a3b51acc "spirv: handle AMD_gcn_shader extended instructions"
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoac: make use of if/loop build helpers
Timothy Arceri [Wed, 7 Mar 2018 00:10:54 +0000 (11:10 +1100)]
ac: make use of if/loop build helpers

These helpers insert the basic block in the same order as they
appear in NIR making it easier to follow LLVM IR dumps. The helpers
also insert more useful labels onto the blocks.

TGSI uses the line number of the corresponding opcode in the TGSI
dump as the label id, here we use the corresponding block index
from NIR.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agoradeonsi: make use of if/loop build helpers in ac
Timothy Arceri [Tue, 6 Mar 2018 23:55:47 +0000 (10:55 +1100)]
radeonsi: make use of if/loop build helpers in ac

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agoac: add if/loop build helpers
Timothy Arceri [Tue, 6 Mar 2018 23:53:34 +0000 (10:53 +1100)]
ac: add if/loop build helpers

These have been ported over from radeonsi.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agoradv: enable AMD_gcn_shader extension
Daniel Schürmann [Fri, 23 Feb 2018 12:55:01 +0000 (13:55 +0100)]
radv: enable AMD_gcn_shader extension

Signed-off-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoac: implement AMD_gcn_shader extended instructions
Daniel Schürmann [Fri, 23 Feb 2018 12:55:00 +0000 (13:55 +0100)]
ac: implement AMD_gcn_shader extended instructions

Co-authored-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agospirv: handle AMD_gcn_shader extended instructions
Daniel Schürmann [Fri, 23 Feb 2018 12:54:59 +0000 (13:54 +0100)]
spirv: handle AMD_gcn_shader extended instructions

Co-authored-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agonir: add AMD_gcn_shader extended instructions
Daniel Schürmann [Fri, 23 Feb 2018 12:54:58 +0000 (13:54 +0100)]
nir: add AMD_gcn_shader extended instructions

Signed-off-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agospirv: import AMD extensions header from glslang
Daniel Schürmann [Fri, 23 Feb 2018 12:54:57 +0000 (13:54 +0100)]
spirv: import AMD extensions header from glslang

Signed-off-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agomeson: Fix indent in omx meson.build
Dylan Baker [Tue, 6 Mar 2018 18:36:09 +0000 (10:36 -0800)]
meson: Fix indent in omx meson.build

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Julien Isorce <julien.isorce@gmail.com>
Tested-by: Karol Herbst <kherbst@redhat.com>
6 years agomeson: Use include directory variables instead of traversing
Dylan Baker [Tue, 6 Mar 2018 18:36:42 +0000 (10:36 -0800)]
meson: Use include directory variables instead of traversing

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Julien Isorce <julien.isorce@gmail.com>
Tested-by: Karol Herbst <kherbst@redhat.com>
6 years agomeson: Re-add auto option for omx
Dylan Baker [Tue, 6 Mar 2018 18:11:38 +0000 (10:11 -0800)]
meson: Re-add auto option for omx

This re-adds the auto option for omx. Without it we default to tizonia
and the build fails almost immediately, which is especially obnoxious
for those building a driver that doesn't support the OMX state tracker
to begin with.

v2: - Only define OMX_FOO for auto cases if the dependencies are found.
      This fixes building tizonia with auto (Julien, Eric)

CC: Gurkirpal Singh <gurkirpal204@gmail.com>
Fixes: bb5e27fab6087a5c1528a5faf507acce700e883c
       ("st/omx/bellagio: Rename st and target directories")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk> (v1)
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Julien Isorce <julien.isorce@gmail.com>
Tested-by: Karol Herbst <kherbst@redhat.com> (v1)
6 years agomeson: fix tizonia compilation
Dylan Baker [Tue, 6 Mar 2018 19:33:16 +0000 (11:33 -0800)]
meson: fix tizonia compilation

It needs to have src/egl in its includes as well.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Julien Isorce <julien.isorce@gmail.com>
Tested-by: Karol Herbst <kherbst@redhat.com>
6 years agomeson: combine state trackers and target if blocks
Dylan Baker [Tue, 6 Mar 2018 19:32:23 +0000 (11:32 -0800)]
meson: combine state trackers and target if blocks

This is needed later since tizonia requires dri

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Julien Isorce <julien.isorce@gmail.com>
Tested-by: Karol Herbst <kherbst@redhat.com>
6 years agost/mesa: expose 0 shader binary formats for compat profiles for Qt
Marek Olšák [Fri, 23 Feb 2018 19:42:41 +0000 (20:42 +0100)]
st/mesa: expose 0 shader binary formats for compat profiles for Qt

Bugzilla: https://bugreports.qt.io/browse/QTBUG-66420
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105065
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org>
6 years agodraw: fix line stippling with aa lines
Roland Scheidegger [Tue, 6 Mar 2018 20:33:16 +0000 (21:33 +0100)]
draw: fix line stippling with aa lines

In contrast to non-aa, where stippling is based on either dx or dy
(depending on if it's a x or y major line), stippling is based on
actual distance with smooth lines, so adjust for this.
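
To make the difference concrete, here is a minimal sketch of the two ways the stipple position can advance. The function names are hypothetical and this is not the draw module's code; the bit lookup follows the OpenGL rule that bit floor(s / factor) mod 16 of the pattern decides whether a fragment is drawn:

```python
import math


def stipple_length(x0, y0, x1, y1, smooth):
    """Length over which the stipple pattern advances along a line.

    Aliased lines advance along the major axis only (|dx| or |dy|);
    smooth lines advance by the actual Euclidean distance.
    """
    dx, dy = abs(x1 - x0), abs(y1 - y0)
    return math.hypot(dx, dy) if smooth else max(dx, dy)


def stipple_bit(pattern, factor, s):
    """True if the 16-bit `pattern` is set at stipple position `s`."""
    bit = int(s // factor) % 16
    return bool((pattern >> bit) & 1)


# A diagonal line from (0, 0) to (30, 40): y-major, so the aliased
# pattern covers 40 steps while the smooth pattern covers 50.
assert stipple_length(0, 0, 30, 40, smooth=False) == 40
assert stipple_length(0, 0, 30, 40, smooth=True) == 50.0
```

For diagonal lines the smooth variant therefore runs through more of the pattern than the aliased one, which is the adjustment the commit describes.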

(There still seem to be some minor artifacts with the mesa demos
line-sample and stippling: the line endpoints don't look quite right
with aa + stippling, maybe due to the integer math in the stipple
stage, but I can't quite pinpoint it.)

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
6 years agodraw: simplify (and correct) aaline fallback (v2)
Roland Scheidegger [Tue, 6 Mar 2018 18:16:45 +0000 (19:16 +0100)]
draw: simplify (and correct) aaline fallback (v2)

The motivation actually was to get rid of the additional tex
instruction, since that requires the draw fallback code to intercept
all sampler / view calls (even if the fallback is never hit).
Basically, the idea is to use coverage of the pixel to calculate
the alpha value, and coverage is simply based on the distance
to the center of the line (in both line direction, which is useful
for wide lines, as well as perpendicular to the line).
This is much closer to what hw supporting this natively actually does.
It also fixes an issue with line width not quite being correct, as
well as endpoints getting stretched too far (in line direction) with
wide lines, which is apparent with mesa demo line-sample.
(For llvmpipe, it would probably make sense to do something like this
directly when drawing lines, since rendering two tris is twice as
expensive as a line, but it would need some changes with state
management.)
Since we're no longer relying on mipmapping to get the alpha value,
we also don't need to draw 3 rects (6 tris), one is sufficient.
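
The idea of deriving alpha from distance to the line can be sketched as follows. This is only an illustration under a stated assumption: coverage is taken to fall off linearly over one pixel at each edge, both across the line and past its endpoints. The actual shader math in the fallback may differ, and `aaline_alpha` is a hypothetical name:

```python
import math


def clamp01(v):
    return max(0.0, min(1.0, v))


def aaline_alpha(px, py, x0, y0, x1, y1, width):
    """Approximate coverage of pixel centre (px, py) by a wide line.

    Assumes a non-degenerate line (the endpoints differ) and a linear
    one-pixel falloff at every edge of the line's rectangle.
    """
    dx, dy = x1 - x0, y1 - y0
    length = math.hypot(dx, dy)
    ux, uy = dx / length, dy / length      # unit vector along the line
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2  # centre of the line
    rx, ry = px - cx, py - cy
    along = abs(rx * ux + ry * uy)         # distance from centre, along the line
    across = abs(rx * uy - ry * ux)        # distance from centre, perpendicular
    a_across = clamp01(width / 2 + 0.5 - across)
    a_along = clamp01(length / 2 + 0.5 - along)
    return a_across * a_along


# Horizontal line from (0, 0) to (10, 0), 2 pixels wide.
assert aaline_alpha(5, 0, 0, 0, 10, 0, 2) == 1.0   # on the centre line
assert aaline_alpha(5, 1, 0, 0, 10, 0, 2) == 0.5   # on the nominal edge
assert aaline_alpha(5, 3, 0, 0, 10, 0, 2) == 0.0   # well outside
```

Because alpha comes straight from the two distances, no texture lookup is involved, which matches the stated motivation of dropping the extra tex instruction.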

There are still issues (as before):
- quite sure it's not correct without half_pixel_center, but can't test
this with GL.
- aaline + line stipple is incorrect (evident with line-sample demo).
Looking at the spec the stipple pattern should actually be based on
distance (not just dx or dy for x/y major lines as without aa).
- outputs (other than pos + the one used for line aa) should be
reinterpolated since we actually increase line length by half a pixel
(but there's no tests which would care).

v2: simplify the math (should be equivalent), don't need immediate
v3: use float versions of atan2,cos,sin, minor cleanups

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
6 years agoradv: Don't emit a warning on VI-GFX9.
Bas Nieuwenhuizen [Wed, 7 Mar 2018 15:38:32 +0000 (16:38 +0100)]
radv: Don't emit a warning on VI-GFX9.

We are conformant:

https://www.khronos.org/conformance/adopters/conformant-products#submission_308

v2: Actually don't emit it on gfx9.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
6 years agoradv: Enable vulkan 1.1.0 for configurations that can support it.
Bas Nieuwenhuizen [Tue, 6 Feb 2018 00:40:00 +0000 (01:40 +0100)]
radv: Enable vulkan 1.1.0 for configurations that can support it.

Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years agoradv: Disable sampler ycbcr conversion.
Bas Nieuwenhuizen [Mon, 22 Jan 2018 21:22:41 +0000 (22:22 +0100)]
radv: Disable sampler ycbcr conversion.

Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years agoradv: Expose that we don't support any VK_KHR_16_bit_storage parts.
Bas Nieuwenhuizen [Sun, 21 Jan 2018 23:34:08 +0000 (00:34 +0100)]
radv: Expose that we don't support any VK_KHR_16_bit_storage parts.

Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years agoradv: Implement vkEnumerateInstanceVersion.
Bas Nieuwenhuizen [Sun, 21 Jan 2018 21:34:11 +0000 (22:34 +0100)]
radv: Implement vkEnumerateInstanceVersion.

Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years agoradv: Add trivial device group implementation.
Bas Nieuwenhuizen [Sun, 21 Jan 2018 16:13:26 +0000 (17:13 +0100)]
radv: Add trivial device group implementation.

Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years agoradv: Implement vkCmdDispatchBase.
Bas Nieuwenhuizen [Sun, 21 Jan 2018 15:32:38 +0000 (16:32 +0100)]
radv: Implement vkCmdDispatchBase.

Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years agoradv: Implement vkGetDeviceQueue2.
Bas Nieuwenhuizen [Sun, 21 Jan 2018 15:11:48 +0000 (16:11 +0100)]
radv: Implement vkGetDeviceQueue2.

Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years agoradv: Support VkPhysicalDeviceProtectedMemoryFeatures.
Bas Nieuwenhuizen [Sun, 21 Jan 2018 14:59:45 +0000 (15:59 +0100)]
radv: Support VkPhysicalDeviceProtectedMemoryFeatures.

Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years agoradv: Support VkPhysicalDeviceShaderDrawParameterFeatures.
Bas Nieuwenhuizen [Sun, 21 Jan 2018 14:57:59 +0000 (15:57 +0100)]
radv: Support VkPhysicalDeviceShaderDrawParameterFeatures.

Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years agoradv: Implement VK_KHR_maintenance3.
Bas Nieuwenhuizen [Sun, 21 Jan 2018 14:53:03 +0000 (15:53 +0100)]
radv: Implement VK_KHR_maintenance3.

Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years agoradv: Add minimal subgroup support.
Bas Nieuwenhuizen [Sun, 21 Jan 2018 14:06:10 +0000 (15:06 +0100)]
radv: Add minimal subgroup support.

Deliberately not implementing workgroup scopes as that is not needed
for core vulkan.

Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years agoradv: Change client version check.
Bas Nieuwenhuizen [Sun, 21 Jan 2018 12:55:26 +0000 (13:55 +0100)]
radv: Change client version check.

Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years agoradv: Update MAX_API_VERSION to 1.1.0
Bas Nieuwenhuizen [Sun, 21 Jan 2018 12:39:22 +0000 (13:39 +0100)]
radv: Update MAX_API_VERSION to 1.1.0

v2: Don't bump supported version.
v3: Update json files.

Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years agoac/nir: Add vote_ieq/vote_feq lowering pass.
Bas Nieuwenhuizen [Mon, 5 Feb 2018 21:54:18 +0000 (22:54 +0100)]
ac/nir: Add vote_ieq/vote_feq lowering pass.

The old vote_eq implementation supported only booleans, but now
we have to support arbitrary values, so use the read_first_invocation
intrinsic + ballot.

I took this as an opportunity to figure out how easy it was to do this
in nir instead of in the nir_to_llvm pass, and it actually turned out
pretty okay IMO. Only creating the pass is some extra code.
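
A minimal model of that lowering, simulating a subgroup as a Python list with an active-lane mask. The helper names mirror the intrinsics mentioned above but are illustrative only; this is not the NIR pass itself:

```python
def ballot(values, active):
    """Bitmask with bit i set when lane i is active and values[i] is true."""
    mask = 0
    for lane, (v, on) in enumerate(zip(values, active)):
        if on and v:
            mask |= 1 << lane
    return mask


def read_first_invocation(values, active):
    """Value held by the lowest-numbered active lane."""
    for v, on in zip(values, active):
        if on:
            return v
    raise ValueError("no active lanes")


def vote_ieq(values, active):
    """True when every active lane holds the same value."""
    first = read_first_invocation(values, active)
    same = ballot([v == first for v in values], active)
    everyone = ballot([True] * len(values), active)
    return same == everyone


# Lane 2 holds a different value but is inactive, so the vote passes.
assert vote_ieq([7, 7, 9, 7], [True, True, False, True])
assert not vote_ieq([7, 7, 9, 7], [True, True, True, True])
```

Comparing against the first active lane works for arbitrary values, which is why it can replace the old boolean-only implementation.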

Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years agoanv: Support version overrides
Jason Ekstrand [Fri, 10 Nov 2017 03:17:29 +0000 (19:17 -0800)]
anv: Support version overrides

While always sketchy to do, this is useful for debugging.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years agovulkan/util: Add a helper to get a version override
Jason Ekstrand [Fri, 10 Nov 2017 03:17:17 +0000 (19:17 -0800)]
vulkan/util: Add a helper to get a version override

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years agoanv: Enable Vulkan 1.1
Jason Ekstrand [Fri, 22 Sep 2017 14:44:10 +0000 (07:44 -0700)]
anv: Enable Vulkan 1.1

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
6 years agoanv: Add support for SPIR-V 1.3 subgroup operations
Jason Ekstrand [Fri, 28 Apr 2017 08:22:39 +0000 (01:22 -0700)]
anv: Add support for SPIR-V 1.3 subgroup operations

This requires us to bump the subgroup size to 32 for all shader stages
because Vulkan requires that to be a physical device query.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
6 years agointel/fs: Add support for subgroup quad operations
Jason Ekstrand [Fri, 1 Sep 2017 22:18:02 +0000 (15:18 -0700)]
intel/fs: Add support for subgroup quad operations

NIR has code to lower these away for us but we can do significantly
better in many cases with register regioning and SIMD4x2.

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
6 years agointel/fs: Implement reduce and scan operations
Jason Ekstrand [Fri, 1 Sep 2017 05:12:48 +0000 (22:12 -0700)]
intel/fs: Implement reduce and scan operations

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
6 years agointel/fs: Add a helper for emitting scan operations
Jason Ekstrand [Fri, 1 Sep 2017 04:50:31 +0000 (21:50 -0700)]
intel/fs: Add a helper for emitting scan operations

This commit adds a helper to the builder for emitting "scan" operations.
Given a binary operation #, a scan takes the vector [a0, a1, ..., aN]
and returns the vector [a0, a0 # a1, ..., a0 # a1 # ... # aN] where each
channel contains the combination of all previous channels.  The sequence
of instructions to perform the scan is fairly optimal; a 16-wide scan on
a 32-bit type is only 6 instructions.  The subgroup scan and reduction
operations will be implemented in terms of this.
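
The definition above can be written out directly, along with one well-known way to compute it in log2(N) whole-vector steps. This is a sketch for illustration: it does not reproduce the instruction sequence the helper emits, and the function names are made up here.

```python
def inclusive_scan(values, op):
    """Reference definition: out[i] = values[0] # ... # values[i]."""
    out, acc = [], None
    for i, v in enumerate(values):
        acc = v if i == 0 else op(acc, v)
        out.append(acc)
    return out


def inclusive_scan_log_steps(values, op):
    """Same result using log2(N) whole-vector steps.

    In each step every channel i >= step combines with channel i - step,
    which a SIMD machine could do for all channels at once.
    Assumes `op` is associative.
    """
    out = list(values)
    step = 1
    while step < len(out):
        prev = list(out)
        for i in range(step, len(out)):
            out[i] = op(prev[i - step], prev[i])
        step *= 2
    return out


add = lambda a, b: a + b
assert inclusive_scan([1, 2, 3, 4], add) == [1, 3, 6, 10]
assert inclusive_scan_log_steps(list(range(16)), add) == inclusive_scan(list(range(16)), add)
```

A reduction is then just the last channel of the scan.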

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
6 years agointel/fs: Add a couple of simple helper opcodes
Jason Ekstrand [Fri, 1 Sep 2017 04:45:30 +0000 (21:45 -0700)]
intel/fs: Add a couple of simple helper opcodes

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
6 years agospirv: Add support for subgroup arithmetic
Jason Ekstrand [Wed, 30 Aug 2017 03:10:35 +0000 (20:10 -0700)]
spirv: Add support for subgroup arithmetic

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
6 years agonir: Add a helper for getting binop identities
Jason Ekstrand [Wed, 30 Aug 2017 03:36:55 +0000 (20:36 -0700)]
nir: Add a helper for getting binop identities

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
6 years agonir: Add subgroup arithmetic reduction intrinsics
Jason Ekstrand [Wed, 30 Aug 2017 03:09:58 +0000 (20:09 -0700)]
nir: Add subgroup arithmetic reduction intrinsics

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
6 years agospirv: Add subgroup quad support
Jason Ekstrand [Tue, 29 Aug 2017 17:21:31 +0000 (10:21 -0700)]
spirv: Add subgroup quad support

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
6 years agonir: Add quad operations and lowering
Jason Ekstrand [Tue, 29 Aug 2017 17:20:56 +0000 (10:20 -0700)]
nir: Add quad operations and lowering

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
6 years agoi965/fs: Add support for nir_intrinsic_shuffle
Jason Ekstrand [Tue, 29 Aug 2017 16:21:32 +0000 (09:21 -0700)]
i965/fs: Add support for nir_intrinsic_shuffle

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
6 years agospirv: Add subgroup shuffle support
Jason Ekstrand [Tue, 29 Aug 2017 16:44:44 +0000 (09:44 -0700)]
spirv: Add subgroup shuffle support

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
6 years agonir: Add subgroup shuffle intrinsics and lowering
Jason Ekstrand [Thu, 7 Dec 2017 05:41:47 +0000 (21:41 -0800)]
nir: Add subgroup shuffle intrinsics and lowering

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
6 years agoi965/fs: Support nir_intrinsic_vote_feq
Jason Ekstrand [Tue, 29 Aug 2017 00:38:53 +0000 (17:38 -0700)]
i965/fs: Support nir_intrinsic_vote_feq

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
6 years agonir/lower_subgroups: Add scalarizing for vote_eq
Jason Ekstrand [Tue, 29 Aug 2017 02:55:34 +0000 (19:55 -0700)]
nir/lower_subgroups: Add scalarizing for vote_eq

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
6 years agospirv: Add subgroup vote support
Jason Ekstrand [Thu, 24 Aug 2017 18:01:22 +0000 (11:01 -0700)]
spirv: Add subgroup vote support

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
6 years agonir: Generalize nir_intrinsic_vote_eq
Jason Ekstrand [Tue, 29 Aug 2017 00:33:33 +0000 (17:33 -0700)]
nir: Generalize nir_intrinsic_vote_eq

The SPIR-V extension wants us to be able to do an AllEqual on any vector
or scalar type.  This has two implications:

 1) We need to be able to handle vectors so we switch the vote_eq
    intrinsics to be vectorized intrinsics.

 2) We need to handle floats which have different behavior with respect
    to +-0, NaN, etc. than the integer variant so we need two variants.
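
The second point is easy to demonstrate outside the compiler. The sketch below compares 32-bit floats once with float semantics and once by raw bit pattern; the function names are illustrative and only stand in for the float and integer variants of the intrinsic:

```python
import struct


def bits(x):
    """Raw 32-bit pattern of a float, as an unsigned integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def all_equal_float(values):
    """Float comparison: +0.0 == -0.0, and NaN never equals anything."""
    return all(v == values[0] for v in values)


def all_equal_int(values):
    """Integer comparison of the raw bit patterns."""
    return all(bits(v) == bits(values[0]) for v in values)


nan = float("nan")
# Signed zeros: equal as floats, different as bit patterns.
assert all_equal_float([0.0, -0.0]) and not all_equal_int([0.0, -0.0])
# Identical NaNs: same bit pattern, but never equal as floats.
assert all_equal_int([nan, nan]) and not all_equal_float([nan, nan])
```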

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
6 years agospirv: Add subgroup ballot support
Jason Ekstrand [Tue, 22 Aug 2017 23:53:05 +0000 (16:53 -0700)]
spirv: Add subgroup ballot support

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
6 years agoi965/fs: Implement basic SPIR-V subgroup intrinsics
Jason Ekstrand [Tue, 22 Aug 2017 05:17:37 +0000 (22:17 -0700)]
i965/fs: Implement basic SPIR-V subgroup intrinsics

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
6 years ago spirv: Add initial subgroup support
Jason Ekstrand [Fri, 28 Apr 2017 11:45:50 +0000 (04:45 -0700)]
spirv: Add initial subgroup support

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
6 years ago nir: Add new SPIR-V ballot intrinsics and lowering
Jason Ekstrand [Tue, 9 May 2017 23:44:13 +0000 (16:44 -0700)]
nir: Add new SPIR-V ballot intrinsics and lowering

Someone can make the lowering optional later if they want something
different for their hardware.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
6 years ago compiler: Add two new system values for subgroups
Jason Ekstrand [Sat, 30 Sep 2017 21:50:40 +0000 (14:50 -0700)]
compiler: Add two new system values for subgroups

This will be required for SPIR-V subgroup support.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
6 years ago nir: Add new SPIR-V ballot ALU intrinsics and lowering
Jason Ekstrand [Tue, 3 Oct 2017 01:19:44 +0000 (18:19 -0700)]
nir: Add new SPIR-V ballot ALU intrinsics and lowering

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
6 years ago spirv: Handle the new OpModuleProcessed instruction
Jason Ekstrand [Wed, 11 Oct 2017 23:29:28 +0000 (16:29 -0700)]
spirv: Handle the new OpModuleProcessed instruction

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years ago anv: Stop returning VK_ERROR_INCOMPATIBLE_DRIVER
Jason Ekstrand [Wed, 11 Oct 2017 23:06:13 +0000 (16:06 -0700)]
anv: Stop returning VK_ERROR_INCOMPATIBLE_DRIVER

From the Vulkan 1.1 spec:

    "Vulkan 1.0 implementations were required to return
    VK_ERROR_INCOMPATIBLE_DRIVER if apiVersion was larger than 1.0.
    Implementations that support Vulkan 1.1 or later must not return
    VK_ERROR_INCOMPATIBLE_DRIVER for any value of apiVersion."

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years ago anv: Implement vkEnumerateInstanceVersion
Jason Ekstrand [Thu, 12 Oct 2017 01:09:32 +0000 (18:09 -0700)]
anv: Implement vkEnumerateInstanceVersion

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years ago anv/device: fail to initialize device if we have queues with unsupported flags
Iago Toral Quiroga [Tue, 6 Feb 2018 09:37:16 +0000 (10:37 +0100)]
anv/device: fail to initialize device if we have queues with unsupported flags

This is not strictly necessary since users should not be requesting any
flags that are not valid for the list of enabled features requested and
we already fail if they attempt to use an unsupported feature, however
it is an easy-to-implement sanity check that would help developers realize
that they are doing things wrong, so we might as well do it.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
6 years ago anv/device: GetDeviceQueue2 should only return queues with matching flags
Iago Toral Quiroga [Tue, 6 Feb 2018 09:06:30 +0000 (10:06 +0100)]
anv/device: GetDeviceQueue2 should only return queues with matching flags

From the Vulkan 1.1 spec, VkDeviceQueueInfo2 structure:

   "The queue returned by vkGetDeviceQueue2 must have the same flags value
    from this structure as that used at device creation time in a
    VkDeviceQueueCreateInfo instance. If no matching flags were specified
    at device creation time then pQueue will return VK_NULL_HANDLE."

For us this means no flags at all since we don't support any.
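
The matching rule quoted above can be sketched roughly as follows. This is a simplified model, not the anv code: `struct queue`, `struct device` and `get_device_queue2` are hypothetical stand-ins, and `NULL` plays the role of VK_NULL_HANDLE.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver's queue and device objects. */
struct queue {
   uint32_t flags;   /* flags supplied at device creation time */
};

struct device {
   struct queue queue;
};

/* Return the queue only if the requested flags match the flags it was
 * created with; otherwise return NULL (standing in for VK_NULL_HANDLE).
 */
static struct queue *
get_device_queue2(struct device *dev, uint32_t requested_flags)
{
   if (dev->queue.flags != requested_flags)
      return NULL;

   return &dev->queue;
}
```

Since no flags are supported, a queue in this model is always created with `flags == 0`, so any non-zero request yields `NULL`.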

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
6 years ago anv: Support querying for protected memory
Jason Ekstrand [Fri, 22 Sep 2017 17:03:18 +0000 (10:03 -0700)]
anv: Support querying for protected memory

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
6 years ago anv: Implement GetDeviceQueue2
Jason Ekstrand [Fri, 6 Oct 2017 02:29:27 +0000 (19:29 -0700)]
anv: Implement GetDeviceQueue2

This belongs to the protected memory feature but there's nothing about
it that's specific to protected memory.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
6 years ago anv: Trivially implement VK_KHR_device_group
Jason Ekstrand [Thu, 21 Sep 2017 20:54:55 +0000 (13:54 -0700)]
anv: Trivially implement VK_KHR_device_group

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
6 years ago anv: Implement vkCmdDispatchBase
Jason Ekstrand [Tue, 3 Oct 2017 22:23:07 +0000 (15:23 -0700)]
anv: Implement vkCmdDispatchBase

This is part of the device groups extension/feature but it's a decent
chunk of work in its own right so it's worth breaking into its own
patch.  The mechanism we use is fairly straightforward: we just push the
base work group id into the shader and add it to the work group id we
get from dispatch.
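
As a rough illustration of that mechanism, the arithmetic amounts to a per-component add. The sketch below is plain C rather than shader or driver code, and `struct wg_id` and `global_work_group_id` are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical three-component work group id. */
struct wg_id {
   uint32_t x, y, z;
};

/* Add the base work group id (pushed into the shader) to the work group
 * id produced by the dispatch itself.
 */
static struct wg_id
global_work_group_id(struct wg_id base, struct wg_id from_dispatch)
{
   struct wg_id id = {
      base.x + from_dispatch.x,
      base.y + from_dispatch.y,
      base.z + from_dispatch.z,
   };
   return id;
}
```

For example, a base of (4, 0, 2) combined with a dispatch-relative id of (1, 3, 0) gives (5, 3, 2).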

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years ago nir/spirv: Add support for device groups
Jason Ekstrand [Thu, 21 Sep 2017 22:51:55 +0000 (15:51 -0700)]
nir/spirv: Add support for device groups

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years ago anv: Implement VK_KHR_maintenance3
Jason Ekstrand [Thu, 5 Oct 2017 23:03:29 +0000 (16:03 -0700)]
anv: Implement VK_KHR_maintenance3

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years ago anv: Support VkPhysicalDeviceShaderDrawParameterFeatures
Jason Ekstrand [Fri, 13 Oct 2017 18:03:07 +0000 (11:03 -0700)]
anv: Support VkPhysicalDeviceShaderDrawParameterFeatures

This advertises the VK_KHR_shader_draw_parameters functionality as a
"core optional feature" in Vulkan 1.1.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years ago anv/entrypoints: Drop support for protect attributes
Jason Ekstrand [Tue, 17 Oct 2017 04:48:11 +0000 (21:48 -0700)]
anv/entrypoints: Drop support for protect attributes

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years ago Get rid of a bunch of KHR suffixes
Jason Ekstrand [Wed, 20 Sep 2017 20:16:26 +0000 (13:16 -0700)]
Get rid of a bunch of KHR suffixes

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years ago anv: Add version 1.1.0 but leave it disabled
Jason Ekstrand [Wed, 20 Sep 2017 19:18:10 +0000 (12:18 -0700)]
anv: Add version 1.1.0 but leave it disabled

This requires us to rename any Vulkan API entrypoints which became core
in 1.1 to no longer have the KHR suffix.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years ago spirv: Update the SPIR-V headers and json to 1.3.1
Jason Ekstrand [Mon, 21 Aug 2017 23:15:36 +0000 (16:15 -0700)]
spirv: Update the SPIR-V headers and json to 1.3.1

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years ago vulkan: Update the XML and headers to 1.1.70
Jason Ekstrand [Tue, 19 Sep 2017 20:04:13 +0000 (13:04 -0700)]
vulkan: Update the XML and headers to 1.1.70

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years ago vulkan/enum_to_str: Add support for aliases and new Vulkan versions
Jason Ekstrand [Thu, 21 Sep 2017 15:26:06 +0000 (08:26 -0700)]
vulkan/enum_to_str: Add support for aliases and new Vulkan versions

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>