mesa.git
Samuel Pitoiset [Wed, 19 Oct 2016 11:09:49 +0000 (13:09 +0200)]
nvc0/ir: simplify predicate logic for GK104 atomic operations

The predicate is always CC_NOT_P as defined in
processSurfaceCoordsNVE4(), so we only want to emit OR.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Samuel Pitoiset [Wed, 19 Oct 2016 11:02:02 +0000 (13:02 +0200)]
nvc0/ir: remove useless NVC0LoweringPass::gMemBase

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Samuel Pitoiset [Wed, 19 Oct 2016 12:01:33 +0000 (14:01 +0200)]
nv50/ir: print CCTL subops in debug mode

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ian Romanick [Wed, 19 Oct 2016 15:53:10 +0000 (08:53 -0700)]
nir: Optimize integer division and modulus with 1

The previous power-of-two rules didn't catch idiv (because i965 doesn't
set lower_idiv) and imod cases.  The udiv and umod cases should have
been caught, but I included them for orthogonality.

This fixes silly code observed from compute shaders with local_size_[xy]
= 1.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98299
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
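
Illustration (not part of the commit): a minimal sketch of the identities described above, written as a toy pattern rewriter. Expressions are nested tuples, loosely in the spirit of how NIR algebraic rules are written, but `DIV_OPS`, `MOD_OPS` and `simplify` are hypothetical names and this is not the actual Mesa code.

```python
# Toy algebraic simplifier: expressions are nested tuples (op, lhs, rhs),
# variables are strings, constants are ints.
# Hypothetical illustration only; not Mesa's real optimization pass.

DIV_OPS = {"idiv", "udiv"}   # x / 1 -> x
MOD_OPS = {"imod", "umod"}   # x % 1 -> 0


def simplify(expr):
    """Recursively apply the divide-by-one and modulus-by-one identities."""
    if not isinstance(expr, tuple):
        return expr  # variable or constant: nothing to do
    op, lhs, rhs = expr
    lhs, rhs = simplify(lhs), simplify(rhs)
    if rhs == 1:
        if op in DIV_OPS:
            return lhs
        if op in MOD_OPS:
            return 0
    return (op, lhs, rhs)
```

With a local size of 1, an index computation such as `("umod", "idx", 1)` folds to the constant 0, and a division by the local size folds to its left operand.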
Marek Olšák [Tue, 18 Oct 2016 21:20:29 +0000 (23:20 +0200)]
configure.ac: enable EGL platform DRM if GBM is enabled

Since GBM is enabled by default, the EGL DRM platform is now enabled by
default as well.

The whitespace changes remove tabs.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Marek Olšák [Tue, 18 Oct 2016 21:19:58 +0000 (23:19 +0200)]
configure.ac: enable GBM by default

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Marek Olšák [Tue, 18 Oct 2016 21:18:28 +0000 (23:18 +0200)]
configure.ac: print whether GBM is enabled

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Marek Olšák [Tue, 18 Oct 2016 13:20:22 +0000 (15:20 +0200)]
radeonsi: eliminate trivial constant VS outputs

These constant value VS PARAM exports:
- 0,0,0,0
- 0,0,0,1
- 1,1,1,0
- 1,1,1,1
can be loaded into PS inputs using the DEFAULT_VAL field, and the VS exports
can be removed from the IR to save export & parameter memory.

After LLVM optimizations, analyze the IR to see which exports are equal to
the ones listed above (or undef) and remove them if they are.

Targeted use cases:
- All DX9 eON ports always clear 10 VS outputs to 0.0 even if most of them
  are unused by PS (such as Witcher 2 below).
- VS output arrays with unused elements that the GLSL compiler can't
  eliminate (such as Batman below).

The shader-db deltas are quite interesting:
(not from upstream si-report.py, it won't be upstreamed)

PERCENTAGE DELTAS    Shaders PARAM exports (affected only)
batman_arkham_origins    589  -67.17 %
bioshock-infinite       1769   -0.47 %
dirt-showdown            548   -2.68 %
dota2                   1747   -3.36 %
f1-2015                  776   -4.94 %
left_4_dead_2           1762   -0.07 %
metro_2033_redux        2670   -0.43 %
portal                   474   -0.22 %
talos_principle          324   -3.63 %
warsow                   176   -2.20 %
witcher2                1040  -73.78 %
----------------------------------------
All affected             991  -65.37 %  ... 9681 -> 3353
----------------------------------------
Total                  26725  -10.82 %  ... 58490 -> 52162

v2: treat Undef as both 0 and 1

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> (v1)
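
Illustration (not part of the commit): a minimal sketch of the matching step described above, assuming each export is a 4-tuple of channels. `PATTERNS` and `match_default_val` are hypothetical names. The returned index is only the position in the list of patterns from the commit message; how it maps to the hardware DEFAULT_VAL encoding is not stated there.

```python
# Hypothetical sketch: decide whether a 4-channel VS PARAM export matches one
# of the four constant patterns listed in the commit message. None stands for
# an undef channel, which (per v2) is allowed to match both 0 and 1.

PATTERNS = [
    (0.0, 0.0, 0.0, 0.0),
    (0.0, 0.0, 0.0, 1.0),
    (1.0, 1.0, 1.0, 0.0),
    (1.0, 1.0, 1.0, 1.0),
]


def match_default_val(channels):
    """Return the index of the first matching pattern, or None.

    `channels` is a 4-tuple holding a float for a constant channel, None for
    an undef channel, or any other object (e.g. a string) for a channel that
    is not constant.
    """
    for index, pattern in enumerate(PATTERNS):
        if all(c is None or c == p for c, p in zip(channels, pattern)):
            return index
    return None
```

An export that matches can be dropped from the shader; one that returns None has to stay.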
Samuel Pitoiset [Tue, 18 Oct 2016 17:59:27 +0000 (19:59 +0200)]
nv50/ir: silent TGSI_PROPERTY_FS_DEPTH_LAYOUT

Found this informational message while replaying a trace from
Metro 2033 Redux. Mark the property as unused for now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Emil Velikov [Wed, 19 Oct 2016 17:46:22 +0000 (18:46 +0100)]
docs: add 13.1.0-devel release notes template, bump version

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Emil Velikov [Wed, 19 Oct 2016 16:33:38 +0000 (17:33 +0100)]
docs: rename release notes to 13.0.0

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Marek Olšák [Fri, 16 Sep 2016 20:42:54 +0000 (22:42 +0200)]
radeonsi: remove cb0_is_integer handling

st/mesa does this for us.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Fri, 16 Sep 2016 20:39:15 +0000 (22:39 +0200)]
st/mesa: disable alpha-test, alpha-to-coverage, alpha-to-one for integer FBs

v2: rebased

Reviewed-by: Brian Paul <brianp@vmware.com>
Marek Olšák [Sun, 16 Oct 2016 22:54:35 +0000 (00:54 +0200)]
mesa: remove gl_shader_compiler_options::EmitNoNoise

it's always true

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 16 Oct 2016 22:47:49 +0000 (00:47 +0200)]
glsl_to_tgsi: remove code for fixing up TGSI labels

I don't know what this was supposed to do, but all TGSI labels were
always 0.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 16 Oct 2016 22:38:41 +0000 (00:38 +0200)]
glsl_to_tgsi: remove subroutine support

Never used. The GLSL compiler doesn't even look at EmitNoFunctions.

v2: add back "return" support in "main"

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 16 Oct 2016 22:11:21 +0000 (00:11 +0200)]
mesa_to_tgsi: remove remnants of flow control and subroutine support

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 16 Oct 2016 22:07:01 +0000 (00:07 +0200)]
mesa_to_tgsi: drop support for instructions that can't occur here

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 16 Oct 2016 20:08:03 +0000 (22:08 +0200)]
glsl_to_tgsi: allocate glsl_to_tgsi_instruction::tex_offsets on demand

sizeof(glsl_to_tgsi_instruction): 384 -> 264

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 16 Oct 2016 20:04:02 +0000 (22:04 +0200)]
glsl_to_tgsi: merge buffer and sampler fields in glsl_to_tgsi_instruction

sizeof(glsl_to_tgsi_instruction): 416 -> 384

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 16 Oct 2016 19:58:13 +0000 (21:58 +0200)]
glsl_to_tgsi: reduce the size of glsl_to_tgsi_instruction using bitfields

sizeof(glsl_to_tgsi_instruction): 464 -> 416

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
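
Illustration (not part of the commit): a minimal sketch of why bitfields shrink a structure, using Python's ctypes to mimic C struct layout. The field names and widths are hypothetical and do not reflect the real glsl_to_tgsi_instruction.

```python
import ctypes

# Hypothetical field names; this is not the real glsl_to_tgsi_instruction.


class PlainInst(ctypes.Structure):
    # One full-width unsigned int per field.
    _fields_ = [
        ("op", ctypes.c_uint),
        ("saturate", ctypes.c_uint),
        ("tex_target", ctypes.c_uint),
        ("is_64bit", ctypes.c_uint),
    ]


class PackedInst(ctypes.Structure):
    # Same fields as bitfields; together they need 8 + 1 + 4 + 1 = 14 bits,
    # so they share a single unsigned int.
    _fields_ = [
        ("op", ctypes.c_uint, 8),
        ("saturate", ctypes.c_uint, 1),
        ("tex_target", ctypes.c_uint, 4),
        ("is_64bit", ctypes.c_uint, 1),
    ]


def sizes():
    """Return (plain_size, packed_size) in bytes."""
    return ctypes.sizeof(PlainInst), ctypes.sizeof(PackedInst)
```

The trade-off is that each bitfield can only hold values that fit its declared width, so the widths have to be chosen against the largest value a field can take.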
Marek Olšák [Sun, 16 Oct 2016 19:30:05 +0000 (21:30 +0200)]
glsl_to_tgsi: reduce the size of st_dst_reg and st_src_reg

I noticed that glsl_to_tgsi_instruction is far too large.

sizeof(glsl_to_tgsi_instruction): 752 -> 464 (-38%)

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 16 Oct 2016 19:28:36 +0000 (21:28 +0200)]
glsl_to_tgsi: remove unused st_translate::tex_offsets

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 16 Oct 2016 19:22:11 +0000 (21:22 +0200)]
glsl_to_tgsi: remove unused parameters from calc_deref_offsets

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 16 Oct 2016 21:22:55 +0000 (23:22 +0200)]
glsl_to_tgsi: use array_id for temp arrays instead of hacking high bits

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Adam Jackson [Thu, 6 Oct 2016 19:37:54 +0000 (15:37 -0400)]
reviewers: Throw myself on the GLX grenade

Signed-off-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Eric Engestrom [Wed, 19 Oct 2016 14:09:26 +0000 (15:09 +0100)]
egl: bring back the default glapi.so name

An earlier commit replaced the default platform-specific libglapi.so name
with an #error.

This may have been overzealous, since the name is correct for the BSD
platforms, at least. Reinstate the hunk, bringing OpenBSD et al. back
to a successful build state.

Fixes: 7a9c92d071d ("egl/dri2: non-shared glapi cleanups")
[Emil Velikov: format the patch from Eric, add commit message and tag.]
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Iago Toral Quiroga [Tue, 27 Sep 2016 10:23:44 +0000 (12:23 +0200)]
i965: fix subnr overflow in suboffset()

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Dave Airlie [Wed, 19 Oct 2016 07:34:28 +0000 (17:34 +1000)]
radv: decompress fmask before reading using texture unit

Before we can read the fmask using the compute shader, we need
to decompress the fmask in place.

This fixes a bunch of remaining failures and hopefully multisampling
in Talos.

Dave Airlie [Wed, 19 Oct 2016 05:43:26 +0000 (15:43 +1000)]
radv: fix samples_identical return value.

This was returning the inverted result, so it did not behave as it should.

We need to compare the fmask value with 0, and return the result
from that.

Dave Airlie [Wed, 19 Oct 2016 03:53:55 +0000 (13:53 +1000)]
radv: fix wsi porting regression in swapchain destroy.

The code in anv is right. There's a pending patch to fix this up
differently, but I'll sync the code for now.

Dave Airlie [Wed, 19 Oct 2016 02:27:04 +0000 (12:27 +1000)]
radv: fix fmask ptr issue

We were using the wrong descriptor in the fmask picking code.

Dave Airlie [Tue, 18 Oct 2016 03:20:11 +0000 (13:20 +1000)]
radv: simplify fast clear shaders

There is no need for anything but a noop shader here.

Dave Airlie [Wed, 19 Oct 2016 00:53:51 +0000 (10:53 +1000)]
vulkan/wsi: fix out of tree build.

Dave Airlie [Mon, 10 Oct 2016 02:20:36 +0000 (03:20 +0100)]
radv: start using defines for the user sgpr offsets

This adds some comments and defines for the user sgprs,
so that we can move them around more easily later without
having to change and revalidate every one of them.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Fri, 14 Oct 2016 06:49:34 +0000 (07:49 +0100)]
radv: port to common wsi codebase

This drops all the radv WSI code in favour of using
the new shared code that was ported from anv.

This regresses Talos for now. Jason has pointed out that
the bug is in Talos and we should wait for them to fix it.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Dave Airlie [Fri, 14 Oct 2016 06:12:33 +0000 (07:12 +0100)]
anv: move to using shared wsi code

This moves the shared code to a common subdirectory
and makes anv link against that code instead of the copy
it was using.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Dave Airlie [Fri, 14 Oct 2016 05:36:17 +0000 (06:36 +0100)]
anv/wsi: remove all anv references from WSI common code

The WSI code should now be clean for sharing.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Dave Airlie [Fri, 14 Oct 2016 04:42:29 +0000 (05:42 +0100)]
anv: move common wsi code to x11/wayland common files.

The next task is to rename all the anv_ prefixes out of this
and move it to a common location.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Dave Airlie [Fri, 14 Oct 2016 04:14:45 +0000 (05:14 +0100)]
anv/wsi/wayland: add callback to get device format properties.

This avoids having to know the toplevel API name.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Dave Airlie [Fri, 14 Oct 2016 02:09:02 +0000 (03:09 +0100)]
anv/wsi/wl: stop using device in more places

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Dave Airlie [Fri, 14 Oct 2016 01:51:36 +0000 (02:51 +0100)]
anv/wsi: split out surface creation to avoid instance API

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Dave Airlie [Fri, 14 Oct 2016 01:38:49 +0000 (02:38 +0100)]
anv/wsi: move further away from passing anv displays around

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Dave Airlie [Fri, 14 Oct 2016 00:34:10 +0000 (01:34 +0100)]
anv/wsi: split image alloc/free out to separate fns.

This moves these outside the wsi platform code, so we can reuse
that code

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Dave Airlie [Thu, 13 Oct 2016 23:42:56 +0000 (00:42 +0100)]
anv/wsi: switch to using VkDevice in swapchain

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Dave Airlie [Thu, 13 Oct 2016 23:35:12 +0000 (00:35 +0100)]
anv/wsi/x11: more refactoring to use generic handles

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Dave Airlie [Thu, 13 Oct 2016 23:21:17 +0000 (00:21 +0100)]
anv/wsi/x11: start refactoring out the image allocation/free functionality

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Dave Airlie [Thu, 13 Oct 2016 04:32:41 +0000 (05:32 +0100)]
anv/wsi: drop device from get format

Just use the wsi_device instead.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Dave Airlie [Thu, 13 Oct 2016 04:26:03 +0000 (05:26 +0100)]
anv/wsi: remove device from get_support interface

Replace it with wsi_device and allocator.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Dave Airlie [Thu, 13 Oct 2016 04:25:33 +0000 (05:25 +0100)]
anv/wsi/x11: abstract WSI interface from internals.

This allows the API and the internals to be split, and the
internals shared.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Dave Airlie [Thu, 13 Oct 2016 04:18:34 +0000 (05:18 +0100)]
anv/wsi/x11: push anv_device out of the init/finish routines

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Dave Airlie [Thu, 13 Oct 2016 04:14:52 +0000 (05:14 +0100)]
anv/wsi: abstract wsi interfaces away from device a bit more.

This is a step towards separating out the wsi code for sharing

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Dave Airlie [Thu, 13 Oct 2016 04:07:27 +0000 (05:07 +0100)]
anv/wsi/x11: push device out of x11 connection fns.

Just pass the allocator/wsi_interface instead.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Dave Airlie [Thu, 13 Oct 2016 04:27:56 +0000 (05:27 +0100)]
anv/wsi: drop device from get caps

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Dave Airlie [Thu, 13 Oct 2016 04:33:28 +0000 (05:33 +0100)]
anv/wsi: drop get present modes device arg

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Dave Airlie [Thu, 13 Oct 2016 03:43:27 +0000 (04:43 +0100)]
radv/anv/wsi: drop unneeded parameter

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Roland Scheidegger [Sat, 15 Oct 2016 01:53:48 +0000 (03:53 +0200)]
draw: improve vertex fetch (v2)

The per-element fetch performs quite a few calculations that are constant.
These can be moved outside both the per-element loop and the main shader
loop (llvm can mostly figure out on its own that they are constant, but
doing so can have a significant compile time cost).

Similarly, it looks easier to swap the fetch loops (outer loop per attrib,
inner loop filling up the per-vertex elements). This way the aos->soa
conversion can also be done per attrib and not just at the end, though
again this doesn't make much of a difference in the generated code. It
would also make it possible to vectorize the calculations leading to the
fetches.

There's also a minimal change that slightly simplifies the overflow math.

All in all, the generated code seems to look slightly simpler (depending
on the actual vs), but more importantly I've seen a significant reduction
in compile times for some vs (albeit with an old (3.3) llvm version, and
the time reduction really only applies to the optimizations run on the IR).

v2: adapt to other draw change.

No changes with piglit.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
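
Illustration (not part of the commit): a minimal sketch of the loop ordering described above, with the outer loop per attribute and the inner loop per vertex. `fetch_soa` and its parameters are hypothetical. Returning zero for out-of-range fetches is also an assumption, since the commit message only mentions "overflow math" without detail.

```python
# Hypothetical sketch: the outer loop runs per attribute, so per-attribute
# values are set up once, and the inner loop fills in one value per vertex.
# The result is already in SoA form (one list per attribute).


def fetch_soa(buffer, attribs, vertex_indices):
    """Fetch attributes from a flat buffer.

    buffer         -- flat list of floats
    attribs        -- list of (base_offset, stride) pairs, in elements
    vertex_indices -- vertices to fetch
    Returns one list per attribute, each holding one value per vertex.
    """
    limit = len(buffer)  # constant for the whole fetch, computed once
    result = []
    for base_offset, stride in attribs:      # outer loop: per attribute
        column = []
        for v in vertex_indices:             # inner loop: per vertex
            offset = base_offset + v * stride
            # Assumption: out-of-range fetches yield zero.
            column.append(buffer[offset] if 0 <= offset < limit else 0.0)
        result.append(column)
    return result
```

Because each attribute's values are produced together, any per-attribute conversion can be applied to `column` as soon as the inner loop finishes rather than in a separate pass at the end.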
Roland Scheidegger [Fri, 14 Oct 2016 01:08:00 +0000 (03:08 +0200)]
draw: improved handling of undefined inputs

Previous attempts to zero-initialize all inputs were not really optimal
(though no performance impact was measurable). In fact this is not really
necessary, since we know the max number of inputs used.
Instead, just generate fetches for up to the max number of inputs used by
the shader, directly replacing inputs for which there was no vertex element
with zero.
This also cleans up key generation, which previously would have stored
some garbage for these elements.
Also replace the assertion that flags such bogus usage with a
debug_printf (the whole point of initializing the undefined inputs was to
make this case safe to handle).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
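
Illustration (not part of the commit): a minimal sketch of the approach described above, fetching only as many inputs as the shader uses and substituting zeros where no vertex element exists. `fetch_inputs`, `ZERO` and the parameter names are hypothetical.

```python
# Hypothetical sketch: build the shader's input list without zero-initializing
# everything up front. Inputs beyond the bound vertex elements become zeros.

ZERO = (0.0, 0.0, 0.0, 0.0)


def fetch_inputs(vertex_elements, num_shader_inputs, fetch):
    """Build the shader input list.

    vertex_elements   -- list of bound element descriptors (may be shorter
                         than num_shader_inputs)
    num_shader_inputs -- highest input index used by the shader, plus one
    fetch             -- callable that fetches one element's value
    """
    inputs = []
    for i in range(num_shader_inputs):
        if i < len(vertex_elements):
            inputs.append(fetch(vertex_elements[i]))
        else:
            inputs.append(ZERO)  # undefined input: no fetch is issued
    return inputs
```

Elements bound beyond what the shader reads are never fetched, and inputs the shader reads without a bound element are well-defined zeros.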
Roland Scheidegger [Fri, 14 Oct 2016 03:37:34 +0000 (05:37 +0200)]
gallivm: print out time for jitting functions with GALLIVM_DEBUG=perf

Compilation to actual machine code can easily take as much time as the
optimization passes on the IR if not more, so print this out too.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Roland Scheidegger [Tue, 18 Oct 2016 01:37:37 +0000 (03:37 +0200)]
gallivm: Use native packs and unpacks for the lerps

For the texturing packs, things looked pretty terrible. For every
lerp, we were repacking the values, and while those packs look sort of
cheap with 128bit, with 256bit we end up with 2 of them instead of just 1
and, worse, 2 extracts as well (the unpack, however, works fine with a
single instruction, vpmovzxbw, albeit only with llvm 3.8).

Ideally we'd use more clever pack for llvmpipe backend conversion too
since we actually use the "wrong" shuffle (which is more work) when doing
the fs twiddle just so we end up with the wrong order for being able to
do native pack when converting from 2x8f -> 1x16b. But this requires some
refactoring, since the untwiddle is separate from conversion.

This is only used for avx2 256bit pack/unpack for now.

Improves openarena scores by 8% or so, though overall it's still pretty
disappointing how much faster 256bit vectors are even with avx2 (or
rather, aren't...). And, of course, eliminating the needless
packs/unpacks in the first place would eliminate most of that advantage
(not quite all) from this patch.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Dave Airlie [Fri, 14 Oct 2016 03:42:01 +0000 (13:42 +1000)]
anv: drop pointless struct decl.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Fri, 14 Oct 2016 03:41:47 +0000 (13:41 +1000)]
radv: drop pointless struct decl.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Fri, 14 Oct 2016 03:36:45 +0000 (13:36 +1000)]
radv: move to using shared vk_alloc inlines.

This moves to the shared vk_alloc inlines for vulkan
memory allocations.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Fri, 14 Oct 2016 03:31:35 +0000 (13:31 +1000)]
anv: move to using vk_alloc helpers.

This moves all the alloc/free in anv to the generic helpers.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Fri, 14 Oct 2016 03:19:43 +0000 (13:19 +1000)]
vulkan: add vk_alloc.h shared allocation inlines.

Vulkan allocation allows overriding the allocator used.
Add some shared inline helpers for anv/radv to use for this.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
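
Illustration (not part of the commit): a minimal sketch of the allocator-override pattern mentioned above, where a caller-supplied allocator takes precedence over the parent's. The real helpers are C code built around Vulkan allocation callbacks; `default_alloc` and `alloc_with_override` are hypothetical names that model the allocators as plain callables.

```python
# Hypothetical sketch of an allocator-override helper. Allocators are modelled
# as callables taking a size; this is not the vk_alloc.h API.


def default_alloc(size):
    """Fallback allocator: a zero-filled bytearray of the requested size."""
    return bytearray(size)


def alloc_with_override(parent_alloc, override_alloc, size):
    """Allocate with the caller's override if given, else the parent allocator."""
    allocator = override_alloc if override_alloc is not None else parent_alloc
    return allocator(size)
```

Centralizing the choice in one helper means each call site passes both allocators and never repeats the "override or parent" check itself.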
Dave Airlie [Fri, 14 Oct 2016 03:12:08 +0000 (13:12 +1000)]
anv: drop local MIN/MAX macros.

Use the ones from mesa, most places already did.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Fri, 14 Oct 2016 03:11:20 +0000 (13:11 +1000)]
radv: drop local MIN/MAX macros.

Use the ones in macros.h instead.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Fri, 14 Oct 2016 03:10:26 +0000 (13:10 +1000)]
util: move min/max/clamp macros to util macros.h

Although the vulkan drivers include mesa macros.h, for
radv I'd like to move away from that.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Fri, 14 Oct 2016 02:59:55 +0000 (12:59 +1000)]
radv: make use of shared vector helper.

This removes the vector code from radv in favour of sharing
code with anv.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Fri, 14 Oct 2016 02:57:44 +0000 (12:57 +1000)]
anv: port to using new u_vector shared helper.

This just removes the anv vector code and uses the new helper.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Fri, 14 Oct 2016 02:55:03 +0000 (12:55 +1000)]
util: add vector util code.

This is ported from anv, both anv and radv can share this.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Brian Paul [Tue, 18 Oct 2016 16:20:55 +0000 (10:20 -0600)]
svga: minor code improvements in svga_validate_pipe_sampler_view()

Use the 'texture' local var in more places.
Rename 'pFormat' to 'viewFormat'.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Lionel Landwerlin [Wed, 12 Oct 2016 19:04:26 +0000 (20:04 +0100)]
intel: genxml: add SAMPLER_BORDER_COLOR_STATE structures

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Boyuan Zhang [Mon, 17 Oct 2016 20:11:48 +0000 (16:11 -0400)]
st/va: force to flush the last p frame in idr period

During dual instance encoding submission, if the second encode task and
the first encode task have no reference dependency, e.g. a p-frame followed
by an idr-frame, there is a chance that the second task will use, for its
reconstructed picture buffer, the same buffer used by the first task for
its reference/reconstructed picture. In this case, buffer corruption may
occur depending on encoding speed. The fix is to force-flush these two
tasks separately to avoid the race condition.

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=98005
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Chad Versace [Tue, 18 Oct 2016 16:39:49 +0000 (09:39 -0700)]
egl/surfaceless: Fix segfault in eglSwapBuffers

Since commit 63c5d5c6c46c8472ee7a8241a0f80f13d79cb8cd, the surfaceless
platform has allowed creation of pbuffer surfaces. But the vtable entry
for eglSwapBuffers has remained NULL.

Discovered by running a little pbuffer test.

Cc: Gurchetan Singh <gurchetansingh@chromium.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Marek Olšák [Mon, 17 Oct 2016 10:51:27 +0000 (12:51 +0200)]
radeonsi: rename prefixes from radeon to si

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Marek Olšák [Mon, 17 Oct 2016 10:42:12 +0000 (12:42 +0200)]
radeonsi: merge radeon_llvm_context and si_shader_context

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Marek Olšák [Mon, 17 Oct 2016 10:30:42 +0000 (12:30 +0200)]
radeonsi: import all TGSI->LLVM code from gallium/radeon

Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Marek Olšák [Sun, 16 Oct 2016 23:51:53 +0000 (01:51 +0200)]
gallium/radeon: simplify initialization of 64-bit gallivm builders

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Marek Olšák [Sun, 16 Oct 2016 23:39:21 +0000 (01:39 +0200)]
gallium/radeon: remove unused radeon_llvm_reg_index_soa

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Marek Olšák [Sun, 16 Oct 2016 23:36:58 +0000 (01:36 +0200)]
radeonsi: move LLVM ALU codegen into radeonsi

Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Jonathan Gray [Sun, 16 Oct 2016 12:08:42 +0000 (23:08 +1100)]
genxml: add generated headers to EXTRA_DIST

Building the Mesa 12.0.3 distfile failed on a system without python
as generated files were not included in the distfile.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Jonathan Gray [Sun, 16 Oct 2016 12:16:19 +0000 (23:16 +1100)]
mesa: automake: include mesa_glinterop.h in distfile

Add mesa_glinterop.h to the list of headers that will get included
in the distfile as it is required to build Mesa itself.

Corrects a regression introduced in a89faa2022fd995af2019c886b152b49a01f9392.

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Jonathan Gray [Sun, 16 Oct 2016 10:06:25 +0000 (21:06 +1100)]
egl: remove docs directory from EXTRA_DIST

The egl docs directory no longer exists as of
88b5c36fe1a1546bf633ee161a6715efc593acbd.

Remove it from EXTRA_DIST to unbreak 'make dist'

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Jonathan Gray [Sun, 16 Oct 2016 05:41:55 +0000 (16:41 +1100)]
genxml: avoid using a GNU make pattern rule

% pattern rules are a GNU extension.  Convert the use of one to an
inference rule to allow this to build on OpenBSD.

This is a related change to the one made in
e3d43dc5eae5271e2c87bab702aa7409d3dd0b23

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Emil Velikov [Mon, 12 Sep 2016 17:47:54 +0000 (18:47 +0100)]
configure.ac: use a single require_libdrm helper

Rather than having 4-5 places which do the explicit check/message, just
polish the gallium helper and use it everywhere.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Emil Velikov [Fri, 9 Sep 2016 14:42:31 +0000 (15:42 +0100)]
configure.ac: remove no longer needed *_pci_id logic

Previously it was used to differentiate between the different codepaths
in the loader. Strictly speaking, though, the core of the loader is only
used when a hardware device is available, which in itself requires libdrm
(one of the available codepaths).

That said, all the configure toggles which relate to enabling/using a hw
device should account for and require libdrm, so there's no need to keep
this code around.

With this gallium_require_drm_loader becomes an empty stub, so nuke that
one as well.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoloader: cleanup copyright section
Emil Velikov [Fri, 9 Sep 2016 15:28:40 +0000 (16:28 +0100)]
loader: cleanup copyright section

With previous patches nearly all the original code (as seen in the
various loaders) is gone.

Update the copyright/license section to reflect that.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoloader: remove loader_get_driver_for_fd() driver_type
Emil Velikov [Mon, 12 Sep 2016 16:48:18 +0000 (17:48 +0100)]
loader: remove loader_get_driver_for_fd() driver_type

A remnant of the pre-loader days, when we had multiple instances of
the loader logic in separate places and one could build a "GALLIUM_ONLY"
version.

Since that is no longer the case and the loaders (glx/egl/gbm) do not
(and should not) need to know any classic/gallium specifics, we can
drop the argument and the related code.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoloader: remove final sysfs codepath in loader_get_device_name_for_fd()
Emil Velikov [Fri, 9 Sep 2016 14:20:23 +0000 (15:20 +0100)]
loader: remove final sysfs codepath in loader_get_device_name_for_fd()

Effectively everyone with actual hardware and/or requesting the
"device_name" requires a working libdrm. Thus they could/should already
be using the (now only) codepath.

Apart from the code simplification, we can slim down our configure.ac
even further. But that will be done in separate patch(es).

Cc: Gary Wong <gtw@gnu.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agotravis: remove no longer needed libudev-dev dependency
Emil Velikov [Wed, 7 Sep 2016 18:06:25 +0000 (19:06 +0100)]
travis: remove no longer needed libudev-dev dependency

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoscons: remove all libudev references
Emil Velikov [Wed, 7 Sep 2016 18:04:55 +0000 (19:04 +0100)]
scons: remove all libudev references

Analogous to previous automake/autoconf commit.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoscons: loader: use libdrm when available
Emil Velikov [Wed, 7 Sep 2016 18:03:29 +0000 (19:03 +0100)]
scons: loader: use libdrm when available

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agogbm: remove superfluous/incorrect udev comment
Emil Velikov [Wed, 7 Sep 2016 17:59:08 +0000 (18:59 +0100)]
gbm: remove superfluous/incorrect udev comment

The gbm_device_get_backend_name() provides a (somewhat) internal name
of the implementation/backend used. It has nothing to do with udev;
one cannot and should not attempt to derive the name from it.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoautomake: remove all the libudev references
Emil Velikov [Wed, 7 Sep 2016 17:56:36 +0000 (18:56 +0100)]
automake: remove all the libudev references

As of the last commit nothing in mesa depends on libudev.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoloader: remove libudev_get_device_name_for_fd and related code
Emil Velikov [Wed, 7 Sep 2016 17:44:47 +0000 (18:44 +0100)]
loader: remove libudev_get_device_name_for_fd and related code

With this all the libudev related code is now gone.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoloader: reimplement loader_get_user_preferred_fd via libdrm
Emil Velikov [Wed, 7 Sep 2016 17:30:48 +0000 (18:30 +0100)]
loader: reimplement loader_get_user_preferred_fd via libdrm

Currently not everyone has libudev and with follow-up patches we'll
completely remove the divergent codepaths.

Use the libdrm drm device API to construct the required ID_PATH_TAG-like
string, to preserve the current functionality for libudev users and
allow others to benefit from it as well.

v2: Drop ranty comments, pick the correct device
v3: \n -> \0 in PCI_ID_PATH_TAG_LENGTH comment (Axel).
v4: Use snprintf (Nicolai)
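
For illustration only, a minimal sketch of how an ID_PATH_TAG-like
string could be built with snprintf from PCI bus information. The
struct pci_bus_info type, the helper names and the exact
"pci-DDDD_BB_dd_F" layout are assumptions made for this example, not
necessarily what the patch does. In the real loader the bus fields
would come from the libdrm drm device API rather than a hand-filled
struct.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the PCI bus information reported for a
 * device; the type and field names are assumptions for illustration. */
struct pci_bus_info {
   unsigned domain;
   unsigned bus;
   unsigned dev;
   unsigned func;
};

/* Assumed tag layout "pci-DDDD_BB_dd_F"; sizeof() of the literal
 * already accounts for the trailing \0. */
#define PCI_ID_PATH_TAG_LENGTH sizeof("pci-0000_00_00_0")

/* Format the tag into 'tag'. Returns 1 on success, 0 if the buffer
 * was too small and the result would have been truncated. */
static int
make_pci_id_path_tag(const struct pci_bus_info *info, char *tag, size_t size)
{
   assert(info != NULL && tag != NULL);

   int n = snprintf(tag, size, "pci-%04x_%02x_%02x_%1u",
                    info->domain, info->bus, info->dev, info->func);

   /* snprintf returns the length it wanted to write, so a value >= size
    * means the output was cut short. */
   return n >= 0 && (size_t)n < size;
}

/* Hypothetical helper: does this device's tag match the one the user
 * asked for? */
static int
pci_id_path_tag_matches(const char *tag, const char *wanted)
{
   return strcmp(tag, wanted) == 0;
}
```

Using snprintf rather than sprintf keeps the write bounded by the
buffer size, and checking the return value lets the caller detect
truncation instead of silently comparing a cut-off tag.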

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
8 years agoloader: annotate __driConfigOptionsLoader as static
Emil Velikov [Wed, 7 Sep 2016 15:38:44 +0000 (16:38 +0100)]
loader: annotate __driConfigOptionsLoader as static

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoloader: separate USE_DRICONF code into separate function
Emil Velikov [Wed, 7 Sep 2016 15:36:51 +0000 (16:36 +0100)]
loader: separate USE_DRICONF code into separate function

Improves readability and allows us to do further cleanups a lot easier.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoloader: slim down loader_get_pci_id_for_fd implementation(s)
Emil Velikov [Wed, 7 Sep 2016 15:04:42 +0000 (16:04 +0100)]
loader: slim down loader_get_pci_id_for_fd implementation(s)

Currently mesa has three code paths in the loader: libudev, manual
sysfs and the drm ioctl one.

Considering the issues we had with libudev, strip those down in favour
of the libdrm drm device API. The latter can be implemented in any way
depending on the platform and can be reused by others.

v2: Use correct message on drmGetDevice failure. (Nicolai)

Cc: Jonathan Gray <jsg@jsg.id.au>
Cc: Jean-Sébastien Pédron <dumbbell@FreeBSD.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Axel Davy <axel.davy@ens.fr>