mesa.git
5 years agoiris: Plumb through ISL_SWIZZLE_IDENTITY in buffer surface emitters
Kenneth Graunke [Thu, 28 Feb 2019 09:13:33 +0000 (01:13 -0800)]
iris: Plumb through ISL_SWIZZLE_IDENTITY in buffer surface emitters

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agoisl: Add a swizzle parameter to isl_buffer_fill_state()
Kenneth Graunke [Thu, 28 Feb 2019 09:13:33 +0000 (01:13 -0800)]
isl: Add a swizzle parameter to isl_buffer_fill_state()

This is necessary for legacy texture buffer object formats, where we'll
need to use a swizzle to fake e.g. luminance.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agoiris: fix decode_get_bo callback
Lionel Landwerlin [Thu, 7 Mar 2019 16:59:53 +0000 (16:59 +0000)]
iris: fix decode_get_bo callback

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: acb50d6b1ff1b7 ("intel/decoders: handle decoding MI_BBS from ring")
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agovirgl: remove unused variable
Erik Faye-Lund [Wed, 6 Mar 2019 13:43:15 +0000 (14:43 +0100)]
virgl: remove unused variable

This variable is now unused, so let's remove it.

Fixes: 9c4930946a5 (virgl: add encoder functions for new protocol)
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
5 years agovirgl: remove unused variable
Erik Faye-Lund [Wed, 6 Mar 2019 13:41:54 +0000 (14:41 +0100)]
virgl: remove unused variable

This variable is now unused, so let's remove it.

Fixes: db77573d7ba (virgl: modify how we handle GL_MAP_FLUSH_EXPLICIT_BIT)
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
5 years agovirgl: remove unused variable
Erik Faye-Lund [Wed, 6 Mar 2019 13:40:04 +0000 (14:40 +0100)]
virgl: remove unused variable

This variable is now unused, so let's remove it.

Fixes: c19aedcf1a8 (virgl: don't mark unclean after a flush)
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
5 years agovirgl: remove unused variables
Erik Faye-Lund [Wed, 6 Mar 2019 13:36:15 +0000 (14:36 +0100)]
virgl: remove unused variables

These variables are now unused, let's remove them to get rif of a few
warnings.

Fixes: f0e71b10888 (virgl: use transfer queue)
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
5 years agoiris: fix decoder call
Lionel Landwerlin [Thu, 7 Mar 2019 16:14:13 +0000 (16:14 +0000)]
iris: fix decoder call

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: acb50d6b1ff1b7 ("intel/decoders: handle decoding MI_BBS from ring")
5 years agointel/aub_write: factorize context image/pphwsp/ring creation
Lionel Landwerlin [Mon, 3 Sep 2018 14:11:08 +0000 (15:11 +0100)]
intel/aub_write: factorize context image/pphwsp/ring creation

We allocate GGTT entries and physical addresses are we create engines
rather than having a fixed layout.

Context images now receive a parameter argument which is used to setup
pml4 & ring buffer addresses.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years agointel/aub_write: turn context images arrays into functions
Lionel Landwerlin [Mon, 3 Sep 2018 14:10:06 +0000 (15:10 +0100)]
intel/aub_write: turn context images arrays into functions

We'll make them more parameterized in a later commit.

As this is just a transitional commit, we allow ourself to leak the
context images allocated in get_context_init(). We'll fix this in the
next commit.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years agointel/aub_write: store the physical page allocator in struct
Lionel Landwerlin [Thu, 23 Aug 2018 23:03:28 +0000 (00:03 +0100)]
intel/aub_write: store the physical page allocator in struct

We want to use this allocator in the next commit for GGTT pages.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years agointel/aub_write: log mmio writes
Lionel Landwerlin [Sat, 25 Aug 2018 00:40:29 +0000 (01:40 +0100)]
intel/aub_write: log mmio writes

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years agointel/aub_write: switch to use i915_drm engine classes
Lionel Landwerlin [Sun, 26 Aug 2018 13:35:30 +0000 (14:35 +0100)]
intel/aub_write: switch to use i915_drm engine classes

Prepare aub write to deal with multiple engine instances. We don't
pass the instance number yet this could be done in the future by
having a 2 dimensional array of struct engine.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years agointel/aub_write: break execlist write in 2
Lionel Landwerlin [Thu, 23 Aug 2018 23:37:03 +0000 (00:37 +0100)]
intel/aub_write: break execlist write in 2

We want to reuse the execlist submission, but won't need the ring
buffer update.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years agointel/aub_write: write header in init
Lionel Landwerlin [Sun, 26 Aug 2018 23:19:29 +0000 (00:19 +0100)]
intel/aub_write: write header in init

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years agointel/aub_write: split comment section from HW setup
Lionel Landwerlin [Thu, 23 Aug 2018 19:36:16 +0000 (20:36 +0100)]
intel/aub_write: split comment section from HW setup

In the future we'll want error2aub to reuse the context image saved by
i915 instead of the default one we write in intel_dump_gpu.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years agointel/aub_read: reuse defines from gen_context
Lionel Landwerlin [Thu, 23 Aug 2018 16:07:08 +0000 (17:07 +0100)]
intel/aub_read: reuse defines from gen_context

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years agointel/decoders: limit number of decoded batchbuffers
Lionel Landwerlin [Tue, 4 Sep 2018 14:45:32 +0000 (15:45 +0100)]
intel/decoders: limit number of decoded batchbuffers

IGT has a test to hang the GPU that works by having a batch buffer
jump back into itself, trigger an infinite loop on the command stream.
As our implementation of the decoding is "perfectly" mimicking the
hardware, our decoder also "hangs". This change limits the number of
batch buffer we'll decode before we bail to 100.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years agointel/decoders: handle decoding MI_BBS from ring
Lionel Landwerlin [Tue, 28 Aug 2018 10:41:42 +0000 (11:41 +0100)]
intel/decoders: handle decoding MI_BBS from ring

An MI_BATCH_BUFFER_START in the ring buffer acts as a second level
batchbuffer (aka jump back to ring buffer when running into a
MI_BATCH_BUFFER_END).

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years agointel/decoders: add address space indicator to get BOs
Lionel Landwerlin [Sun, 26 Aug 2018 12:52:47 +0000 (13:52 +0100)]
intel/decoders: add address space indicator to get BOs

Some commands like MI_BATCH_BUFFER_START have this indicator.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years agovulkan/overlay: fix missing var rename in previous commit
Eric Engestrom [Thu, 7 Mar 2019 13:45:14 +0000 (13:45 +0000)]
vulkan/overlay: fix missing var rename in previous commit

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agovulkan/util: use the platform defines in vk.xml instead of hard-coding them
Eric Engestrom [Tue, 5 Mar 2019 15:57:34 +0000 (15:57 +0000)]
vulkan/util: use the platform defines in vk.xml instead of hard-coding them

See also: 3d4238d26c5de4a0f7a5 "anv: use the platform defines in vk.xml
                                instead of hard-coding them"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agoiris: add support for tgsi_to_nir
Andre Heider [Sun, 10 Feb 2019 17:31:59 +0000 (18:31 +0100)]
iris: add support for tgsi_to_nir

The Gallium Nine state tracker now works on iris.

Also tested with GALLIUM_HUD and Star Wars: Knights of the Old
Republic on WINE (GL_ATI_fragment_shader).

Signed-off-by: Andre Heider <a.heider@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agonir: free dead_ctx in case of no progress
Tapani Pälli [Wed, 6 Mar 2019 10:30:22 +0000 (12:30 +0200)]
nir: free dead_ctx in case of no progress

Fixes a leak:

  ==7576== 320 (48 direct, 272 indirect) bytes in 1 blocks are definitely lost in loss record 26 of 26
  ==7576==    at 0x4C2EE3B: malloc (vg_replace_malloc.c:309)
  ==7576==    by 0x53EF0E4: ralloc_size (ralloc.c:119)
  ==7576==    by 0x53EF0C2: ralloc_context (ralloc.c:113)
  ==7576==    by 0x5471F64: nir_split_per_member_structs (nir_split_per_member_structs.c:176)
  ==7576==    by 0x51288CF: anv_shader_compile_to_nir (anv_pipeline.c:216)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agoanv: call blob_finish when done with it
Tapani Pälli [Wed, 6 Mar 2019 10:27:30 +0000 (12:27 +0200)]
anv: call blob_finish when done with it

Fixes leaks from anv_device_upload_nir:

  ==7345== 8,192 bytes in 2 blocks are definitely lost in loss record 24 of 24
  ==7345==    at 0x4C2ED78: malloc (vg_replace_malloc.c:308)
  ==7345==    by 0x4C31393: realloc (vg_replace_malloc.c:836)
  ==7345==    by 0x54E0848: grow_to_fit (blob.c:67)
  ==7345==    by 0x54E0BE5: blob_reserve_bytes (blob.c:166)
  ==7345==    by 0x54E0C7C: blob_reserve_intptr (blob.c:186)
  ==7345==    by 0x54704A7: nir_serialize (nir_serialize.c:1091)
  ==7345==    by 0x512F97D: anv_device_upload_nir (anv_pipeline_cache.c:756)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agoanv: use anv_gem_munmap in block pool cleanup
Tapani Pälli [Wed, 6 Mar 2019 08:49:21 +0000 (10:49 +0200)]
anv: use anv_gem_munmap in block pool cleanup

Use anv_gem_munmap for unmap when softpin in use, this corresponds to
anv_gem_mmap used in anv_block_pool_expand_range. This fixes valgrind
errors seen for each pool when softpin is in use:

  ==25581== 262,144 bytes in 1 blocks are definitely lost in loss record 31 of 31
  ==25581==    at 0x50E77E8: anv_gem_mmap (anv_gem.c:96)
  ==25581==    by 0x50EEE2B: anv_block_pool_expand_range (anv_allocator.c:543)
  ==25581==    by 0x50EEB51: anv_block_pool_init (anv_allocator.c:477)
  ==25581==    by 0x50EF7EF: anv_state_pool_init (anv_allocator.c:920)
  ==25581==    by 0x510B8EB: anv_CreateDevice (anv_device.c:2031)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agoiris: Fix MOCS for blits and clears
Kenneth Graunke [Wed, 6 Mar 2019 22:49:39 +0000 (14:49 -0800)]
iris: Fix MOCS for blits and clears

I915_MOCS_CACHED is the wrong value.  Expose mocs() and use that.

5 years agost/glsl: start spilling out common st glsl conversion code
Timothy Arceri [Wed, 4 Apr 2018 06:01:21 +0000 (16:01 +1000)]
st/glsl: start spilling out common st glsl conversion code

The NIR and TGSI paths are currently intertwined which makes it
not only hard to follow but also makes it hard to take advantage
of the differences in IR.

Here we take the first step to splitting that path apart. With
this we take the opportunity to no longer call the GLSL IR
optimisation passes after the final lowering calls for NIR. We
can instead just use the NIR passes which can produce better code
and should also result in faster compile times.

The speed-up can be measured in some dolphin uber shaders due to
no longer calling lower_if_to_cond_assign() for example
dolphin/ubershaders/120.shader_test goes from ~1.63 -> ~1.53
seconds on my machine.

There are some code changes as a result of not calling
lower_if_to_cond_assign(), this is because it flattens ifs that
contain UBOs where as NIR's peephole select doesn't. This is
were most of the regressions in Max Waves happens with shader-db.

shader-db results (VEGA):

Totals from affected shaders:
SGPRS: 2349056 -> 2349640 (0.02 %)
VGPRS: 1322160 -> 1323300 (0.09 %)
Spilled SGPRs: 21190 -> 21527 (1.59 %)
Spilled VGPRs: 99 -> 99 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 72 -> 72 (0.00 %) dwords per thread
Code Size: 57260904 -> 57270932 (0.02 %) bytes
Compile Time: 1107186 -> 1022942 (-7.61 %) milliseconds
LDS: 786 -> 786 (0.00 %) blocks
Max Waves: 391932 -> 391619 (-0.08 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agoradeonsi/nir: stop calling nir_lower_returns()
Timothy Arceri [Thu, 21 Feb 2019 01:15:18 +0000 (12:15 +1100)]
radeonsi/nir: stop calling nir_lower_returns()

We now call this for all drivers in glsl_to_nir() instead.

Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agoi965: stop calling nir_lower_returns()
Timothy Arceri [Thu, 21 Feb 2019 00:54:09 +0000 (11:54 +1100)]
i965: stop calling nir_lower_returns()

We now call this for all drivers in glsl_to_nir() instead.

Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agoglsl: use NIR function inlining for drivers that use glsl_to_nir()
Timothy Arceri [Wed, 20 Feb 2019 06:13:49 +0000 (17:13 +1100)]
glsl: use NIR function inlining for drivers that use glsl_to_nir()

glsl_to_nir() is still missing support for converting certain
functions to NIR, so for those we use the GLSL IR optimisations
to remove the functions.

Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agoglsl/freedreno/panfrost: pass gl_context to the standalone compiler
Timothy Arceri [Fri, 22 Feb 2019 00:51:24 +0000 (11:51 +1100)]
glsl/freedreno/panfrost: pass gl_context to the standalone compiler

This allows us to use the ctx with glsl_to_nir() in a following
patch.

Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agovulkan/overlay: drop dependency on validation layer headers
Lionel Landwerlin [Tue, 5 Mar 2019 12:19:10 +0000 (12:19 +0000)]
vulkan/overlay: drop dependency on validation layer headers

v2: reimplement layer chain info getters (Eric)

v3: make it compile.. (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agovulkan/util: generate instance/device dispatch tables
Lionel Landwerlin [Tue, 5 Mar 2019 10:38:14 +0000 (10:38 +0000)]
vulkan/util: generate instance/device dispatch tables

This will be used by the overlay instead of system installed
validation layers helpers.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agovulkan/util: make header available from c++
Lionel Landwerlin [Mon, 25 Feb 2019 23:01:02 +0000 (23:01 +0000)]
vulkan/util: make header available from c++

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agoiris: setup EdgeFlag Vertex Element when needed.
Jose Maria Casanova Crespo [Wed, 27 Feb 2019 19:44:27 +0000 (20:44 +0100)]
iris: setup EdgeFlag Vertex Element when needed.

If Vertex Shader uses EdgeFlag the hardware request that it is setup
as the last VERTEX_ELEMENT_STATE. If SGVS are add at draw time we
need to also reconfigure the last 3DSTATE_VF_INSTANCING so its
VertexElementIndex points to the new Vertex Element that contains
the EdgeFlag.

So if draw parameters or edgeflag are not used the CSO generated at
iris_create_vertex_element is sent directly in the batches. But if
edge flag is used we adjust last VERTEX_ELEMENT_STATE and
last 3DSTATE_VF_INSTANCING using their alternative edge flag version
we generate at iris_create_vertex_element and store at the CSO.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agov3d: Include a count of register pressure in the RA failure dumps.
Eric Anholt [Thu, 21 Feb 2019 20:47:37 +0000 (12:47 -0800)]
v3d: Include a count of register pressure in the RA failure dumps.

You usually want to go find the highest pressure and figure out why you
couldn't spill or what pattern led to a bunch of pressure leading to that
point.

5 years agoradv: enable lower_mul_2x32_64
Samuel Pitoiset [Wed, 6 Mar 2019 21:35:31 +0000 (22:35 +0100)]
radv: enable lower_mul_2x32_64

Fixes: 58bcebd987b ("spirv: Allow [i/u]mulExtended to use new nir opcode")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agost/nir: Move 64-bit lowering later
Jason Ekstrand [Mon, 4 Mar 2019 23:02:39 +0000 (17:02 -0600)]
st/nir: Move 64-bit lowering later

Now that we have a loop unrolling cost function and loop unrolling isn't
going to kill us the moment we have a 64-bit op in a loop, we can go
ahead and move 64-bit lowering later.  This gives us the opportunity to
do more optimizations and actually let the full optimizer run even on
64-bit ops rather than hoping one round of opt_algebraic will fix
everything.  This substantially reduces both fp64 shader compile times
and the resulting code size.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agointel/nir: Move 64-bit lowering later
Jason Ekstrand [Mon, 4 Mar 2019 22:11:57 +0000 (16:11 -0600)]
intel/nir: Move 64-bit lowering later

Now that we have a loop unrolling cost function and loop unrolling isn't
going to kill us the moment we have a 64-bit op in a loop, we can go
ahead and move 64-bit lowering later.  This gives us the opportunity to
do more optimizations and actually let the full optimizer run even on
64-bit ops rather than hoping one round of opt_algebraic will fix
everything.  This substantially reduces both fp64 shader compile times
and the resulting code size.  On the vs-isnan-dvec test from piglit:

Before this commit:

    1684.63s user 17.29s system 99% cpu 28:28.24 total
    101479 instructions. 0 loops. 802452 cycles. 79:369 spills:fills.
    Peak memory usage (according to massif): 1.435 GB

After this commit:

    179.64s user 7.75s system 99% cpu 3:07.92 total
    57316 instructions. 0 loops. 459287 cycles. 0:0 spills:fills.
    Peak memory usage (according to massif): 531.0 MB

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agonir/lower_doubles: Inline functions directly in lower_doubles
Jason Ekstrand [Mon, 4 Mar 2019 21:55:19 +0000 (15:55 -0600)]
nir/lower_doubles: Inline functions directly in lower_doubles

Instead of trusting the caller to already have created a softfp64
function shader and added all its functions to our shader, we simply
take the softfp64 shader as an argument and do the function inlining
ouselves.  This means that there's no more nasty functions lying around
that the caller needs to worry about cleaning up.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agonir/deref: Expose nir_opt_deref_impl
Jason Ekstrand [Mon, 4 Mar 2019 22:17:02 +0000 (16:17 -0600)]
nir/deref: Expose nir_opt_deref_impl

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agonir/inline_functions: Break inlining into a builder helper
Jason Ekstrand [Mon, 4 Mar 2019 21:32:36 +0000 (15:32 -0600)]
nir/inline_functions: Break inlining into a builder helper

This pulls the guts of function inlining into a builder helper so that
it can be used elsewhere.  The rest of the infrastructure is still
needed for most inlining cases to ensure that everything gets inlined
and only ever once.  However, there are use-cases where you just want to
inline one little thing.  This new helper also has a neat trick where it
can seamlessly inline a function from one nir_shader into another.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agoglsl/nir: Inline functions in float64_funcs_to_nir
Jason Ekstrand [Mon, 4 Mar 2019 20:39:40 +0000 (14:39 -0600)]
glsl/nir: Inline functions in float64_funcs_to_nir

This doesn't really change anything as the functions will all get
inlined anyway.  However it does let us do a bit of the work earlier and
in a common place.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agoglsl/nir: Add a shared helper for building float64 shaders
Jason Ekstrand [Sun, 3 Mar 2019 16:00:14 +0000 (10:00 -0600)]
glsl/nir: Add a shared helper for building float64 shaders

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agointel/nir: Drop an unneeded lower_constant_initializers call
Jason Ekstrand [Mon, 4 Mar 2019 22:01:23 +0000 (16:01 -0600)]
intel/nir: Drop an unneeded lower_constant_initializers call

Even though this is technically a step in the function inlining process
as laid out in nir_inline_functions.c, it's not really needed.  We
already have constant initializers lowered here and no new ones are
added by appending the softfp64 functions.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agointel/debug: Add a debug flag to force software fp64
Jason Ekstrand [Sun, 3 Mar 2019 16:10:46 +0000 (10:10 -0600)]
intel/debug: Add a debug flag to force software fp64

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agoi965: Compile the fp64 program based on nir options
Jason Ekstrand [Tue, 5 Mar 2019 04:54:44 +0000 (22:54 -0600)]
i965: Compile the fp64 program based on nir options

Instead of looking the devinfo directly, look at the lowering options we
provided to NIR.  This is more accurate as it's now checking for "do we
need full software lowering" rather than a hardware bit.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agonir: Teach loop unrolling about 64-bit instruction lowering
Jason Ekstrand [Sun, 3 Mar 2019 15:24:12 +0000 (09:24 -0600)]
nir: Teach loop unrolling about 64-bit instruction lowering

The lowering we do for 64-bit instructions can cause a single NIR ALU
instruction to blow up into hundreds or thousands of instructions
potentially with control flow.  If loop unrolling isn't aware of this,
it can unroll a loop 20 times which contains a nir_op_fsqrt which we
then lower to a full software implementation based on integer math.
Those 20 invocations suddenly get a lot more expensive than NIR loop
unrolling currently expects.  By giving it an approximate estimate
function, we can prevent loop unrolling from going to town when it
shouldn't.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agonir: Expose double and int64 op_to_options_mask helpers
Jason Ekstrand [Fri, 1 Mar 2019 23:39:54 +0000 (17:39 -0600)]
nir: Expose double and int64 op_to_options_mask helpers

We already have one internally for int64 but we don't have a similar one
for doubles so we'll have to make one.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agocompiler/nir: add an is_conversion field to nir_op_info
Iago Toral Quiroga [Tue, 12 Feb 2019 11:55:28 +0000 (12:55 +0100)]
compiler/nir: add an is_conversion field to nir_op_info

This is set to True only for numeric conversion opcodes.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agointel/fs: Fix extract_u8 of an odd byte from a 64-bit integer
Ian Romanick [Wed, 27 Feb 2019 23:53:55 +0000 (15:53 -0800)]
intel/fs: Fix extract_u8 of an odd byte from a 64-bit integer

In the old code, we would generate the exact same instruction for
extract_u8(some_u64, 0) and extract_u8(some_u64, 1).  The mask-a-word
trick only works for even numbered bytes.

This fixes the (new) piglit test
tests/spec/arb_gpu_shader_int64/execution/fs-ushr-and-mask.shader_test.

v2: Use a SHR instead of an AND.  This saves an instruction compared to
using two moves.  Suggested by Jason.

Fixes: 6ac2d169019 ("i965/fs: Fix extract_i8/u8 to a 64-bit destination")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agointel/fs: nir_op_extract_i8 extracts a byte, not a word
Ian Romanick [Wed, 27 Feb 2019 23:52:18 +0000 (15:52 -0800)]
intel/fs: nir_op_extract_i8 extracts a byte, not a word

Fixes: 6ac2d169019 ("i965/fs: Fix extract_i8/u8 to a 64-bit destination")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agointel/compiler: Silence unused parameter warning in brw_interpolation_map.c
Ian Romanick [Tue, 26 Feb 2019 19:24:56 +0000 (11:24 -0800)]
intel/compiler: Silence unused parameter warning in brw_interpolation_map.c

The parameter is never used, and it's not part of a common interface
idiom.  Remove it.

src/intel/compiler/brw_interpolation_map.c: In function ‘brw_setup_vue_interpolation’:
src/intel/compiler/brw_interpolation_map.c:62:59: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
                             const struct gen_device_info *devinfo)
                                                           ^~~~~~~

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agointel/compiler: Silence many unused parameter warnings in brw_eu.h
Ian Romanick [Tue, 26 Feb 2019 19:18:49 +0000 (11:18 -0800)]
intel/compiler: Silence many unused parameter warnings in brw_eu.h

In file included from src/intel/compiler/brw_eu_util.c:34:0:
src/intel/compiler/brw_eu.h: In function ‘brw_message_desc_header_present’:
src/intel/compiler/brw_eu.h:288:63: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
 brw_message_desc_header_present(const struct gen_device_info *devinfo,
                                                               ^~~~~~~
src/intel/compiler/brw_eu.h: In function ‘brw_message_ex_desc’:
src/intel/compiler/brw_eu.h:296:51: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
 brw_message_ex_desc(const struct gen_device_info *devinfo,
                                                   ^~~~~~~
src/intel/compiler/brw_eu.h: In function ‘brw_message_ex_desc_ex_mlen’:
src/intel/compiler/brw_eu.h:303:59: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
 brw_message_ex_desc_ex_mlen(const struct gen_device_info *devinfo,
                                                           ^~~~~~~
src/intel/compiler/brw_eu.h: In function ‘brw_sampler_desc_binding_table_index’:
src/intel/compiler/brw_eu.h:337:68: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
 brw_sampler_desc_binding_table_index(const struct gen_device_info *devinfo,
                                                                    ^~~~~~~
src/intel/compiler/brw_eu.h: In function ‘brw_sampler_desc_sampler’:
src/intel/compiler/brw_eu.h:344:56: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
 brw_sampler_desc_sampler(const struct gen_device_info *devinfo, uint32_t desc)
                                                        ^~~~~~~
src/intel/compiler/brw_eu.h: In function ‘brw_sampler_desc_return_format’:
src/intel/compiler/brw_eu.h:371:62: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
 brw_sampler_desc_return_format(const struct gen_device_info *devinfo,
                                                              ^~~~~~~
src/intel/compiler/brw_eu.h: In function ‘brw_dp_desc_binding_table_index’:
src/intel/compiler/brw_eu.h:405:63: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
 brw_dp_desc_binding_table_index(const struct gen_device_info *devinfo,
                                                               ^~~~~~~
src/intel/compiler/brw_eu.h: In function ‘brw_dp_a64_untyped_atomic_desc’:
src/intel/compiler/brw_eu.h:754:41: warning: unused parameter ‘exec_size’ [-Wunused-parameter]
                                unsigned exec_size, /**< 0 for SIMD4x2 */
                                         ^~~~~~~~~
src/intel/compiler/brw_eu.h: In function ‘brw_dp_a64_untyped_atomic_float_desc’:
src/intel/compiler/brw_eu.h:775:47: warning: unused parameter ‘exec_size’ [-Wunused-parameter]
                                      unsigned exec_size,
                                               ^~~~~~~~~

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agomeson: remove unused include_directories(vulkan)
Eric Engestrom [Tue, 20 Nov 2018 13:36:17 +0000 (13:36 +0000)]
meson: remove unused include_directories(vulkan)

The correct include path is "vulkan/…".

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
5 years agomeson: fix with_dri2 definition for GNU Hurd
Eric Engestrom [Tue, 5 Mar 2019 12:32:13 +0000 (12:32 +0000)]
meson: fix with_dri2 definition for GNU Hurd

Suggested-by: Dylan Baker <dylan@pnwbakers.com>
Cc: Timo Aaltonen <tjaalton@debian.org>
Cc: James Clarke <jrtc27@debian.org>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
5 years agoradv: set num_components on vulkan_resource_index intrinsic
Lionel Landwerlin [Wed, 6 Mar 2019 11:43:56 +0000 (11:43 +0000)]
radv: set num_components on vulkan_resource_index intrinsic

In 61e009d2c4e4df we changed the number of components in the
vulkan_resource_index intrinsic and forgot the update Radv's code for
it.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 61e009d2c4e4df ("spirv: Use the same types for resource indices as pointers")
Reviewed-by: Samuel Pitoiset samuel.pitoiset@gmail.com
5 years agonir: rename glsl_type_is_struct() -> glsl_type_is_struct_or_ifc()
Timothy Arceri [Tue, 5 Mar 2019 05:07:12 +0000 (16:07 +1100)]
nir: rename glsl_type_is_struct() -> glsl_type_is_struct_or_ifc()

Replace done using:
find ./src -type f -exec sed -i -- \
's/glsl_type_is_struct(/glsl_type_is_struct_or_ifc(/g' {} \;

Acked-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agoglsl: rename record_types -> struct_types
Timothy Arceri [Tue, 5 Mar 2019 04:58:49 +0000 (15:58 +1100)]
glsl: rename record_types -> struct_types

Acked-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agoglsl: rename record_location_offset() -> struct_location_offset()
Timothy Arceri [Tue, 5 Mar 2019 04:55:57 +0000 (15:55 +1100)]
glsl: rename record_location_offset() -> struct_location_offset()

Replace done using:
find ./src -type f -exec sed -i -- \
's/record_location_offset(/struct_location_offset(/g' {} \;

Acked-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agoglsl: rename get_record_instance() -> get_struct_instance()
Timothy Arceri [Tue, 5 Mar 2019 04:46:14 +0000 (15:46 +1100)]
glsl: rename get_record_instance() -> get_struct_instance()

Replace done using:
find ./src -type f -exec sed -i -- \
's/get_record_instance(/get_struct_instance(/g' {} \;

Acked-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agoglsl: rename is_record() -> is_struct()
Timothy Arceri [Tue, 5 Mar 2019 04:05:52 +0000 (15:05 +1100)]
glsl: rename is_record() -> is_struct()

Replace was done using:
find ./src -type f -exec sed -i -- \
's/is_record(/is_struct(/g' {} \;

Acked-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agonir/spirv: initial handling of OpenCL.std extension opcodes
Karol Herbst [Thu, 12 Jul 2018 13:02:27 +0000 (15:02 +0200)]
nir/spirv: initial handling of OpenCL.std extension opcodes

Not complete, mostly just adding things as I encounter them in CTS. But
not getting far enough yet to hit most of the OpenCL.std instructions.

Anyway, this is better than nothing and covers the most common builtins.

v2: add hadd proof from Jason
    move some of the lowering into opt_algebraic and create new nir opcodes
    simplify nextafter lowering
    fix normalize lowering for inf
    rework upsample to use nir_pack_bits
    add missing files to build systems
v3: split lines of iadd/sub_sat expressions

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agonir/vtn: add support for SpvBuiltInGlobalLinearId
Karol Herbst [Mon, 14 Jan 2019 17:36:37 +0000 (18:36 +0100)]
nir/vtn: add support for SpvBuiltInGlobalLinearId

v2: use formula with fewer operations

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agonir: add support for address bit sized system values
Karol Herbst [Thu, 19 Jul 2018 14:39:58 +0000 (16:39 +0200)]
nir: add support for address bit sized system values

v2: add assert in else clause
    make local group intrinsics 32 bit wide
v3: always use 32 bit constant for local_size
v4: add comment by Jason

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agonir/spirv: improve parsing of the memory model
Karol Herbst [Thu, 19 Jul 2018 11:04:14 +0000 (13:04 +0200)]
nir/spirv: improve parsing of the memory model

v2: add some vtn_fail_ifs

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agonir: replace magic numbers with M_PI
Karol Herbst [Mon, 4 Mar 2019 18:11:12 +0000 (19:11 +0100)]
nir: replace magic numbers with M_PI

we define it inside 'include/c99_math.h' so it is safe to use.

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agoanv: Implement VK_EXT_external_memory_host
Caio Marcelo de Oliveira Filho [Fri, 1 Mar 2019 21:15:31 +0000 (13:15 -0800)]
anv: Implement VK_EXT_external_memory_host

v2: Ignore the import if handleType == 0. (Jason)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agov3d: Drop the V3D 3.x vpm read dead code elimination.
Eric Anholt [Wed, 27 Feb 2019 05:37:47 +0000 (21:37 -0800)]
v3d: Drop the V3D 3.x vpm read dead code elimination.

We now have NIR dead code eliminating our VPM reads, so this shouldn't be
necessary.

5 years agov3d: Eliminate the TLB and TLBU files.
Eric Anholt [Wed, 27 Feb 2019 05:34:22 +0000 (21:34 -0800)]
v3d: Eliminate the TLB and TLBU files.

We can just use the magic register file like we do for other magic waddrs.

5 years agov3d: Use ldunif instructions for uniforms.
Eric Anholt [Tue, 26 Feb 2019 18:17:59 +0000 (10:17 -0800)]
v3d: Use ldunif instructions for uniforms.

The idea is that for repeated use of the same uniform, we could avoid
loading it on each consumer.  The results look pretty good.

total instructions in shared programs: 6413571 -> 6521464 (1.68%)
total threads in shared programs: 154214 -> 154000 (-0.14%)
total uniforms in shared programs: 2393604 -> 2119629 (-11.45%)
total spills in shared programs: 4960 -> 4984 (0.48%)
total fills in shared programs: 6350 -> 6418 (1.07%)

Once we do scheduling at the NIR level, the register pressure (and thus
also instructions) issues we see here will drop back down.

5 years agov3d: Add support for register-allocating a ldunif to a QFILE_TEMP.
Eric Anholt [Tue, 26 Feb 2019 17:19:40 +0000 (09:19 -0800)]
v3d: Add support for register-allocating a ldunif to a QFILE_TEMP.

On V3D 4.x, we can use ldunifrf to load uniforms to any register, and this
will let us schedule the ldunif wherever we want in the program.

5 years agov3d: Drop the old class bits splitting up the accumulators.
Eric Anholt [Tue, 26 Feb 2019 17:13:43 +0000 (09:13 -0800)]
v3d: Drop the old class bits splitting up the accumulators.

This seems to be left over from vc4, and I don't use them any more.

5 years agov3d: Add support for vir-to-qpu of ldunif instructions to a temp.
Eric Anholt [Tue, 26 Feb 2019 17:05:05 +0000 (09:05 -0800)]
v3d: Add support for vir-to-qpu of ldunif instructions to a temp.

We can load a uniform to any register, so add support for non-ALU
instructions with sig.ldunif to a temp.

5 years agov3d: Switch implicit uniforms over to being any qinst->uniform != ~0.
Eric Anholt [Wed, 27 Feb 2019 02:36:05 +0000 (18:36 -0800)]
v3d: Switch implicit uniforms over to being any qinst->uniform != ~0.

I'm not sure why I didn't do this before -- it's clearly much simpler to
add dumping of the extra thing than to have it as another implicit source.

5 years agov3d: Do uniform rematerialization spilling before dropping threadcount
Eric Anholt [Tue, 26 Feb 2019 18:49:25 +0000 (10:49 -0800)]
v3d: Do uniform rematerialization spilling before dropping threadcount

This feels like the right tradeoff for threads vs uniforms, particularly
given that we often have very short thread segments right now:

total instructions in shared programs: 6411504 -> 6413571 (0.03%)
total threads in shared programs: 153946 -> 154214 (0.17%)
total uniforms in shared programs: 2387665 -> 2393604 (0.25%)

5 years agov3d: Fix temporary leaks of temp_registers and when spilling.
Eric Anholt [Tue, 26 Feb 2019 18:46:36 +0000 (10:46 -0800)]
v3d: Fix temporary leaks of temp_registers and when spilling.

On each iteration of successfully spilling a reg, we'd allocate another
copy of temp_registers, and when decrementing thread conut we'd allocate
another copy of the graph.  These all got cleaned up on freeing the
compile.

5 years agogitlab-ci: drop job prefixes
Eric Engestrom [Tue, 26 Feb 2019 14:40:29 +0000 (14:40 +0000)]
gitlab-ci: drop job prefixes

It is already obvious whether the job is building a container or running
a mesa build, so let's drop that prefix so that we can see more
information on the screen (eg. in the jobs list on a pipeline page).

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
5 years agotgsi_to_nir: Set correct location for uniforms.
Timur Kristóf [Mon, 4 Mar 2019 14:10:55 +0000 (15:10 +0100)]
tgsi_to_nir: Set correct location for uniforms.

Previously, only the driver_location was set for all variables,
but constants need to use the location field instead. This change
is necessary because the nine state tracker can produce non-packed
constants whose location needs to be explicitly set.

Signed-Off-By: Timur Kristóf <timur.kristof@gmail.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agotgsi_to_nir: Improve interpolation modes.
Timur Kristóf [Tue, 19 Feb 2019 09:11:36 +0000 (10:11 +0100)]
tgsi_to_nir: Improve interpolation modes.

This patch extracts the interpolation mode translation
into a separate function called ttn_translate_interp_mode,
adds support for TGSI_INTERPOLATE_COLOR which was missing,
and also sets the proper interpolation mode to output
variables, which were not set previously.

Signed-Off-By: Timur Kristóf <timur.kristof@gmail.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agotgsi_to_nir: use sampler variables and derefs
Kenneth Graunke [Wed, 6 Feb 2019 11:04:15 +0000 (03:04 -0800)]
tgsi_to_nir: use sampler variables and derefs

v2: fix is_shadow, is_array and txq

Some drivers (eg. iris) need the presence of sampler variables and derefs
so that they can count them to determine the number of samplers used.
This change also makes the output NIR closer to what glsl_to_nir outputs.

Signed-Off-By: Timur Kristóf <timur.kristof@gmail.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agotgsi_to_nir: Support FACE and POSITION properly.
Timur Kristóf [Fri, 8 Feb 2019 21:19:14 +0000 (22:19 +0100)]
tgsi_to_nir: Support FACE and POSITION properly.

Previously, FACE was hard-coded as a sysval, but TTN emulated
it incorrectly. Also, POSITION was not supported when it was
a sysval. This patch fixes these by allowing both of them to
be sysvals or inputs, based on driver capabilities. It also
fixes the TGSI FACE emulation based on the TGSI spec.

Signed-Off-By: Timur Kristóf <timur.kristof@gmail.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agotgsi_to_nir: Extract ttn_emulate_tgsi_front_face into its own function.
Timur Kristóf [Fri, 8 Feb 2019 21:11:08 +0000 (22:11 +0100)]
tgsi_to_nir: Extract ttn_emulate_tgsi_front_face into its own function.

We'll need to use the same logic in other places, so it makes sense to
have a separate function for this.

Signed-Off-By: Timur Kristóf <timur.kristof@gmail.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agotgsi_to_nir: Restructure system value loads.
Timur Kristóf [Fri, 8 Feb 2019 21:15:56 +0000 (22:15 +0100)]
tgsi_to_nir: Restructure system value loads.

Minor cleanup to the way system value loads work in tgsi_to_nir.

Signed-Off-By: Timur Kristóf <timur.kristof@gmail.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agotgsi_to_nir: Produce optimized NIR for a given pipe_screen.
Timur Kristóf [Tue, 5 Mar 2019 17:59:47 +0000 (18:59 +0100)]
tgsi_to_nir: Produce optimized NIR for a given pipe_screen.

With this patch, tgsi_to_nir will output NIR that is tailored to
the given pipe, by reading its capabilities and adjusting the NIR code
to those capabilities similarly to how glsl_to_nir works.

It also adds an optimization loop that brings the output NIR in line
with what glsl_to_nir outputs. This is necessary for the same reason
why glsl_to_nir has its own optimization loop: currently not every
driver does these optimizations yet.

For uses which cannot pass a pipe_screen we also keep a variant
called tgsi_to_nir_noscreen which keeps the old behavior.

Signed-Off-By: Timur Kristóf <timur.kristof@gmail.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Acked-By: Eric Anholt <eric@anholt.net>
5 years agofreedreno: Plumb pipe_screen through to irX_tgsi_to_nir.
Timur Kristóf [Mon, 4 Mar 2019 12:54:10 +0000 (13:54 +0100)]
freedreno: Plumb pipe_screen through to irX_tgsi_to_nir.

This patch makes it possible for freedreno to pass a pipe_screen
to tgsi_to_nir. This will be needed when tgsi_to_nir supports reading
pipe capabilities.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
5 years agonir: Add multiplier argument to nir_lower_uniforms_to_ubo.
Timur Kristóf [Thu, 28 Feb 2019 09:53:11 +0000 (10:53 +0100)]
nir: Add multiplier argument to nir_lower_uniforms_to_ubo.

Note that locations can be set in different units, and the multiplier
argument caters to supporting these different units. For example,
st_glsl_to_nir uses dwords (4 bytes) so the multiplier should be 4,
while tgsi_to_nir uses bytes, so the multiplier should be 16.

Signed-Off-By: Timur Kristóf <timur.kristof@gmail.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agonir: Move nir_lower_uniforms_to_ubo to compiler/nir.
Timur Kristóf [Fri, 8 Feb 2019 21:36:37 +0000 (22:36 +0100)]
nir: Move nir_lower_uniforms_to_ubo to compiler/nir.

The nir_lower_uniforms_to_ubo function is useful outside of
mesa/state_tracker, and in fact is needed to produce NIR for
drivers that have the PIPE_CAP_PACKED_UNIFORMS capability.

Signed-Off-By: Timur Kristóf <timur.kristof@gmail.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agotgsi_to_nir: Split to smaller functions.
Timur Kristóf [Fri, 8 Feb 2019 08:59:58 +0000 (09:59 +0100)]
tgsi_to_nir: Split to smaller functions.

Previously, tgsi_to_nir was a single big function, and this patch
intends to make the code easier to understand by splitting it up
to multiple smaller pieces.

Signed-Off-By: Timur Kristóf <timur.kristof@gmail.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-By: Tested-by: Rob Clark <robdclark@gmail.com>
5 years agotgsi_to_nir: Make the TGSI IF translation code more readable.
Timur Kristóf [Wed, 13 Feb 2019 23:01:04 +0000 (01:01 +0200)]
tgsi_to_nir: Make the TGSI IF translation code more readable.

This patch is a minor cleanup that only intends to make the
TGSI IF translation a bit easier to read.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agotgsi_to_nir: Fix TGSI LIT translation by using flt.
Timur Kristóf [Wed, 13 Feb 2019 22:45:47 +0000 (00:45 +0200)]
tgsi_to_nir: Fix TGSI LIT translation by using flt.

TGSI spec says LIT needs a "greater than" comparison. NIR doesn't have that,
so let's use "less than" and swap the arguments. Previously "greater than or equal"
was used by tgsi_to_nir which is incorrect.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agotgsi_to_nir: Fix the TGSI ARR translation by converting the result to int.
Timur Kristóf [Thu, 7 Feb 2019 17:01:24 +0000 (18:01 +0100)]
tgsi_to_nir: Fix the TGSI ARR translation by converting the result to int.

According to the TGSI spec, ARR needs to do a rounding and then
a float-to-integer conversion which was missing. This patch also
makes the rounding a bit more efficient by using nir_fround_even
instead of the previous nir_ffloor+nir_fadd trick.

Signed-Off-By: Timur Kristóf <timur.kristof@gmail.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agonir: Add ability for shaders to use window space coordinates.
Timur Kristóf [Tue, 5 Feb 2019 17:08:24 +0000 (18:08 +0100)]
nir: Add ability for shaders to use window space coordinates.

This patch adds a shader_info field that tells the driver to use window
space coordinates for a given vertex shader. It also enables this feature
in radeonsi (the only NIR-capable driver that supported it in TGSI),
and makes tgsi_to_nir aware of it.

Signed-Off-By: Timur Kristóf <timur.kristof@gmail.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agov3d: Move the stores for fixed function VS output reads into NIR.
Eric Anholt [Fri, 22 Feb 2019 22:26:26 +0000 (14:26 -0800)]
v3d: Move the stores for fixed function VS output reads into NIR.

This lets us emit the VPM_WRITEs directly from
nir_intrinsic_store_output() (useful once NIR scheduling is in place so
that we can reduce register pressure), and lets future NIR scheduling
schedule the math to generate them.  Even in the meantime, it looks like
this lets NIR DCE some more code and make better decisions.

total instructions in shared programs: 6429246 -> 6412976 (-0.25%)
total threads in shared programs: 153924 -> 153934 (<.01%)
total loops in shared programs: 486 -> 483 (-0.62%)
total uniforms in shared programs: 2385436 -> 2388195 (0.12%)

Acked-by: Ian Romanick <ian.d.romanick@intel.com> (nir)
5 years agov3d: Translate f2i(fround_even) as FTOIN.
Eric Anholt [Sat, 23 Feb 2019 19:21:26 +0000 (11:21 -0800)]
v3d: Translate f2i(fround_even) as FTOIN.

This appears to be just what the opcode does.  Needed for equivalence when
moving FF VPM stores into NIR.

5 years agonir: Improve printing of load_input/store_output variable names.
Eric Anholt [Sun, 24 Feb 2019 00:17:02 +0000 (16:17 -0800)]
nir: Improve printing of load_input/store_output variable names.

We were printing only when the channel was exactly the start channel, so
scalarized loads/stores would be missing the name on the rest.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
5 years agoanv: Implement VK_EXT_inline_uniform_block
Jason Ekstrand [Tue, 12 Feb 2019 22:56:24 +0000 (16:56 -0600)]
anv: Implement VK_EXT_inline_uniform_block

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agospirv: Use the same types for resource indices as pointers
Jason Ekstrand [Sat, 12 Jan 2019 16:58:33 +0000 (10:58 -0600)]
spirv: Use the same types for resource indices as pointers

We need more space than just a 32-bit scalar and we have to burn all
that space anyway so we may as well expose it to the driver.  This also
fixes a subtle bug when UBOs and SSBOs have different pointer types.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agospirv: Use the generic dereference function for OpArrayLength
Jason Ekstrand [Sat, 12 Jan 2019 16:57:28 +0000 (10:57 -0600)]
spirv: Use the generic dereference function for OpArrayLength

With the new deref changes, the old pointer_offset version may not be
the right one to call.  Just call the generic one and let it sort it
out.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>