mesa.git
Dave Airlie [Mon, 5 Jun 2017 03:25:29 +0000 (13:25 +1000)]
r600: document some of the missing shader constants.

These are used for fragment shader thread calculations.

Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Dave Airlie [Mon, 5 Jun 2017 03:24:12 +0000 (13:24 +1000)]
r600: add register info for atomic counters.

The atomic counters on evergreen are implemented via append/consume
UAV counters. This just adds the register info for them. The EOS
packets are used to get the atomic totals extracted post shader
execution for storing into a buffer.

Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Dave Airlie [Mon, 5 Jun 2017 03:22:07 +0000 (13:22 +1000)]
r600: add missing RAT registers and operations.

This just documents in the headers the RAT operation list,
and the RAT encoding for exports.

The immediate registers are used to point to buffers for the
RAT return values (_RTN instructions).

Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Dave Airlie [Mon, 5 Jun 2017 19:38:34 +0000 (05:38 +1000)]
r600/sb: fix typo in field definitions

Pointed out by glennk.

Marek Olšák [Tue, 30 May 2017 00:04:29 +0000 (02:04 +0200)]
tgsi/scan: fix scanning fragment shaders with PrimID and Position/Face

Not relevant to radeonsi, because Position/Face are system values
with radeonsi, while this codepath is for drivers where Position and
Face are ordinary inputs.

Reviewed-by: Brian Paul <brianp@vmware.com>
Jason Ekstrand [Fri, 26 May 2017 17:57:33 +0000 (10:57 -0700)]
i965: Finalize miptrees before prepare_texture

In order to do resolves for texture views with different formats, we
need intel_texture_object::_Format to be valid.  Calling
intel_finalize_mipmap_tree can safely be done multiple times in a row
and should be a fairly cheap operation.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Marek Olšák [Tue, 30 May 2017 23:46:40 +0000 (01:46 +0200)]
gallium/u_threaded: remove 16 bytes from tc_batch

All other sentinels occupy what is otherwise unused space.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Marek Olšák [Tue, 30 May 2017 23:32:01 +0000 (01:32 +0200)]
gallium/u_threaded: align batches and call slots to 16 bytes

Not sure if this helps.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Marek Olšák [Wed, 31 May 2017 11:07:04 +0000 (13:07 +0200)]
st/mesa: don't load cached TGSI shaders on demand

This fixes a performance issue with the shader cache that delayed Gallium
shader create calls until draw calls.

I'd like this in stable, but it's not a showstopper.

Cc: 17.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Chih-Wei Huang [Sun, 4 Jun 2017 04:53:01 +0000 (12:53 +0800)]
Android: use bionic pthread_barrier_* if possible

The pthread_barrier_* functions have been available in bionic
since Nougat.

Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Dave Airlie [Mon, 5 Jun 2017 03:19:18 +0000 (13:19 +1000)]
r600: fix incorrect and missing bit field in register headers.

The compression field was incorrect, and we were missing the
depth before shader field.

Nicolai Hähnle [Thu, 11 May 2017 23:46:46 +0000 (01:46 +0200)]
radv: use ac_compute_surface

Reviewed-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Sun, 14 May 2017 23:43:25 +0000 (09:43 +1000)]
radv: prepare fmask surface creation

The old code copied over all the surface info from the image
surface; we only want some bits of it, and we need to modify the flags.

This prevents a regression in dEQP-VK.api.copy_and_blit.resolve_image.*
and others in the subsequent switch to ac_compute_surface.

v2:
- also disable opt4Space in radv_amdgpu_surface, so that we can
  apply this patch separately *before* switching to ac_compute_surface
  and hopefully avoid intermittent regressions (Nicolai)

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Nicolai Hähnle [Thu, 11 May 2017 23:38:49 +0000 (01:38 +0200)]
radv: use amdgpu_addr_create

Reviewed-by: Dave Airlie <airlied@redhat.com>
Nicolai Hähnle [Thu, 11 May 2017 23:38:30 +0000 (01:38 +0200)]
radv: stop using radv_amdgpu_winsys::family

Reviewed-by: Dave Airlie <airlied@redhat.com>
Nicolai Hähnle [Thu, 11 May 2017 23:11:27 +0000 (01:11 +0200)]
radv: use ac_gpu_info

Reviewed-by: Dave Airlie <airlied@redhat.com>
Nicolai Hähnle [Thu, 11 May 2017 22:56:06 +0000 (00:56 +0200)]
radv: remove radeon_info::name

Reviewed-by: Dave Airlie <airlied@redhat.com>
Nicolai Hähnle [Wed, 10 May 2017 21:01:00 +0000 (23:01 +0200)]
radv: use ac_surface data structures

This is mostly mechanical changes of renaming types and introducing
"legacy" everywhere.

It doesn't use the ac_surface computation functions yet.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Nicolai Hähnle [Wed, 10 May 2017 20:41:36 +0000 (22:41 +0200)]
radv: rename radeon_surf::bo_{size,alignment} to surf_{size,alignment}

To match radeonsi / ac_surface.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Nicolai Hähnle [Wed, 10 May 2017 20:33:13 +0000 (22:33 +0200)]
radv: remove unused RADEON_SURF_HAS_SBUFFER_MIPTREE

Reviewed-by: Dave Airlie <airlied@redhat.com>
Nicolai Hähnle [Wed, 10 May 2017 20:25:15 +0000 (22:25 +0200)]
radv: remove radeon_surf_level::nblk_z

We're not using thick tiling modes, so we can just derive the value
ourselves.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Nicolai Hähnle [Wed, 10 May 2017 20:20:37 +0000 (22:20 +0200)]
radv: remove radeon_surf_level::dcc_enabled

Like radeonsi; replace with radeon_surf::num_dcc_levels.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Nicolai Hähnle [Wed, 10 May 2017 20:14:39 +0000 (22:14 +0200)]
radv: remove radeon_surf_level::pitch_bytes

Like radeonsi. This saves memory, and the information can easily be
recomputed on the fly where necessary.

Reviewed-by: Dave Airlie <airlied@redhat.com>
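
As a rough illustration of recomputing the pitch on the fly, the sketch
below assumes that the byte pitch of a level is its width in blocks times
the bytes per block element. The struct and function names
(demo_surf_level, demo_level_pitch_bytes) are hypothetical stand-ins and
not the actual radv definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for a per-level surface record. */
struct demo_surf_level {
   uint32_t nblk_x;   /* level width in blocks */
};

/* Assumption: pitch in bytes = blocks per row * bytes per block element. */
static uint32_t
demo_level_pitch_bytes(const struct demo_surf_level *level, uint32_t bpe)
{
   assert(bpe != 0);
   return level->nblk_x * bpe;
}
```

With a stored field gone, callers would compute the value at the point of
use instead of reading it from each level.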
Nicolai Hähnle [Wed, 10 May 2017 20:05:52 +0000 (22:05 +0200)]
radv: add surface helper variable in radv_GetImageSubresourceLayout

Reviewed-by: Dave Airlie <airlied@redhat.com>
Nicolai Hähnle [Tue, 16 May 2017 15:05:02 +0000 (17:05 +0200)]
radv: fewer than 8 RBs are possible

This fixes the subsequent assertion on Bonaire.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Nicolai Hähnle [Tue, 16 May 2017 14:38:27 +0000 (16:38 +0200)]
ac/surface/gfx6: explicitly support S8 surfaces

This is needed by radv for dEQP-VK.renderpass.simple.stencil

Reviewed-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Mon, 5 Jun 2017 00:20:48 +0000 (01:20 +0100)]
ac/nir: set workgroup size attribute to correct value.

This ports: 55445ff1891724c78e6573d2f8c721e14c0449fc from radeonsi

    radeonsi: tell LLVM not to remove s_barrier instructions

    LLVM 5.0 removes s_barrier instructions if the max-work-group-size
    attribute is not set. What a surprise.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Mon, 5 Jun 2017 00:20:10 +0000 (01:20 +0100)]
ac: add new helper function to add an integer target-dependent function attr.

This is needed to add the max workgroup size attribute.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Thu, 3 Nov 2016 04:16:43 +0000 (04:16 +0000)]
radv: add external memory support.

This adds support for exporting 2D images to an
opaque fd.

This implements the:
VK_KHX_external_memory_capabilities
VK_KHX_external_memory
VK_KHX_external_memory_fd

extensions.

These are used by SteamVR. We should work with anv
to decide if we should ship these under an env
var or something.

v2 (Bas): - Don't expose the semaphore ext without implementing it.
          - Only export the capabilities ext as instance ext.
          - Implement radv_GetPhysicalDeviceExternalBufferPropertiesKHX.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Bas Nieuwenhuizen [Tue, 23 May 2017 07:22:09 +0000 (09:22 +0200)]
radv: Add VkPhysicalDeviceIDProperties support.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Mon, 22 May 2017 21:50:13 +0000 (23:50 +0200)]
radv: Add support for external queue family.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Tue, 14 Mar 2017 23:40:17 +0000 (23:40 +0000)]
radv/formats: reverse how the image format properties KHR2 is handled

This just aligns with how anv does it.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Fri, 2 Jun 2017 22:01:36 +0000 (00:01 +0200)]
radv: Dirty all descriptor sets when changing the pipeline.

Sets could have been ignored during a previous descriptor set flush
due to the shader not using them and therefore no SGPR being assigned.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Fixes: ae61ddabe8c "radv: move userdata sgpr ownership to compiler side."
Bas Nieuwenhuizen [Fri, 2 Jun 2017 21:51:50 +0000 (23:51 +0200)]
radv: Set both compute and graphics SGPRS on descriptor set flush.

We clear the descriptors_dirty array afterwards, so the SGPRs for
the other pipeline don't get updated on the flush for that other
draw/dispatch, so we have to make sure we do it immediately.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Fixes: ae61ddabe8c "radv: move userdata sgpr ownership to compiler side."
Chris Wilson [Thu, 6 Oct 2016 20:07:18 +0000 (21:07 +0100)]
i965: Order write of query availability with earlier writes

Currently we signal the availability of the query result using an
unordered pipe-control write. As it is unordered, it may be executed
before the write of the query result itself, and so an observer may
read the query result too early. Fix this by requesting that the write
of the availability flag is ordered after earlier pipe control writes.

Testcase: piglit/arb_query_buffer_object-qbo/*async*
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Lyude [Wed, 24 May 2017 19:42:41 +0000 (15:42 -0400)]
nvc0: Add support for ARB_post_depth_coverage

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Lyude [Wed, 24 May 2017 19:42:40 +0000 (15:42 -0400)]
st/mesa: Add support for ARB_post_depth_coverage

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Lyude [Wed, 24 May 2017 19:42:39 +0000 (15:42 -0400)]
gallium: Add a cap to check if the driver supports ARB_post_depth_coverage

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Lyude [Wed, 24 May 2017 19:42:38 +0000 (15:42 -0400)]
gallium: Add TGSI shader token for ARB_post_depth_coverage

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Lyude [Sat, 3 Jun 2017 00:45:36 +0000 (20:45 -0400)]
nvc0: disable BGRA8 images on Fermi

BGRA8 image stores on Fermi don't work, which results in breaking
PBO downloads, such that they always return 0x0. Discovered this
through a glamor bug, and confirmed it does indeed break a good number
of piglit tests such as spec/arb_pixel_buffer_object/pbo-read-argb8888

Fixes: 8e7893eb53213 ("nvc0: add support for BGRA8 images")
Signed-off-by: Lyude <lyude@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
Anuj Phogat [Thu, 1 Jun 2017 23:36:39 +0000 (16:36 -0700)]
i965: Simplify l3 way size computations

By making use of the l3_banks field in the gen_device_info struct,
l3_way_size for gen7+ = 2 * l3_banks.

V2: Keep the get_l3_way_size() function.

Suggested-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
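
A minimal sketch of the formula stated above, assuming a device-info
record that carries an l3_banks count. The struct and function names here
are hypothetical stand-ins for gen_device_info and get_l3_way_size(), and
the nonzero check is an assumption of this sketch rather than something
the commit message states.

```c
#include <assert.h>

/* Hypothetical stand-in for the relevant part of gen_device_info. */
struct demo_device_info {
   unsigned l3_banks;
};

/* l3_way_size for gen7+ = 2 * l3_banks, as stated in the commit message. */
static unsigned
demo_get_l3_way_size(const struct demo_device_info *devinfo)
{
   /* Assumption: a zero bank count means the field was never filled in. */
   assert(devinfo->l3_banks != 0);
   return 2 * devinfo->l3_banks;
}
```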
Anuj Phogat [Thu, 1 Jun 2017 16:28:04 +0000 (09:28 -0700)]
i965: Add and initialize l3_banks field for gen7+

This new field helps simplify l3 way size computations
in the next patch.

V2: Initialize the l3_banks to 0 in macros.

Suggested-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Chad Versace [Sat, 27 May 2017 01:48:28 +0000 (18:48 -0700)]
i965: Replace 0 with ISL_FORMAT_UNSUPPORTED in format table (v2)

When given an *unsupported* mesa_format,
brw_isl_format_for_mesa_format() returned 0, a *valid* isl_format,
ISL_FORMAT_R32G32B32A32_FLOAT.  The problem is that
brw_isl_format_for_mesa_format's inner table used 0 instead of
ISL_FORMAT_UNSUPPORTED to indicate unsupported mesa formats.

Some callers of brw_isl_format_for_mesa_format() were aware of this
weirdness, and worked around it. This patch removes those workarounds.

v2: Ensure that all array elements are initialized to
  ISL_FORMAT_UNSUPPORTED, even when new formats are added to enum
  mesa_format, by using a designated range initializer.

Reviewed-by: Matt Turner <mattst88@gmail.com>
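
The pattern described in v2 can be sketched as follows. This is an
illustration only: the enums and the function are hypothetical, trimmed
stand-ins for mesa_format, isl_format and
brw_isl_format_for_mesa_format(), and the `[first ... last]` range
designator is a GNU C extension accepted by GCC and Clang.

```c
#include <assert.h>

/* Hypothetical stand-in for enum mesa_format. */
enum demo_mesa_format {
   DEMO_MESA_FORMAT_NONE,
   DEMO_MESA_FORMAT_RGBA_FLOAT32,
   DEMO_MESA_FORMAT_R8_UNORM,
   DEMO_MESA_FORMAT_COUNT
};

/* Hypothetical stand-in for enum isl_format; note that 0 is a valid format. */
enum demo_isl_format {
   DEMO_ISL_FORMAT_R32G32B32A32_FLOAT = 0,
   DEMO_ISL_FORMAT_R8_UNORM = 1,
   DEMO_ISL_FORMAT_UNSUPPORTED = 0xffff
};

static enum demo_isl_format
demo_isl_format_for_mesa_format(enum demo_mesa_format format)
{
   /* The range designator fills every slot with UNSUPPORTED first; the
    * per-format designators that follow override individual entries.
    * Formats added to the enum later stay UNSUPPORTED by default. */
   static const enum demo_isl_format table[DEMO_MESA_FORMAT_COUNT] = {
      [0 ... DEMO_MESA_FORMAT_COUNT - 1] = DEMO_ISL_FORMAT_UNSUPPORTED,
      [DEMO_MESA_FORMAT_RGBA_FLOAT32] = DEMO_ISL_FORMAT_R32G32B32A32_FLOAT,
      [DEMO_MESA_FORMAT_R8_UNORM] = DEMO_ISL_FORMAT_R8_UNORM,
   };

   assert(format < DEMO_MESA_FORMAT_COUNT);
   return table[format];
}
```

Without the range designator, unlisted entries would be zero-initialized
and so would silently read back as the valid format with value 0.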
Gurchetan Singh [Tue, 23 May 2017 00:34:32 +0000 (17:34 -0700)]
st/dri: Use fence extension in drisw.c

This is desirable for synchronization in virtual machines.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Gurchetan Singh [Tue, 23 May 2017 00:33:22 +0000 (17:33 -0700)]
st/dri: move fence implementation into separate file

Since the fence implementation is not dri2.c specific, put
it in a separate file. This way SW implementations can use this
extension too.

v2: Don't depend on dri2.c for extensions (Emil)
v3: Make this patch only move extension into a separate file (Chad).

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Brian Paul [Thu, 1 Jun 2017 20:19:15 +0000 (14:19 -0600)]
mesa: document range of SampleCoverageValue, MinSampleShadingValue

Trivial.

Brian Paul [Mon, 22 May 2017 17:46:27 +0000 (11:46 -0600)]
xlib: fix glXGetCurrentDisplay() failure

glXGetCurrentDisplay() has been broken for years and nobody noticed until
recently.  This change adds a new XMesaGetCurrentDisplay() that the GLX
emulation API can call, just as we did for glXGetCurrentContext().

Tested by hacking glxgears to call glXGetCurrentDisplay() before and
after glXMakeCurrent() to verify the return value is NULL beforehand and
the same as the opened display afterward.

Also tested by Tom Hudson with his test programs.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100988
Cc: mesa-stable@lists.freedesktop.org
Tested-by: Tom Hudson <tom.hudson.phd@gmail.com>
Signed-off-by: Brian Paul <brianp@vmware.com>
Dave Airlie [Thu, 1 Jun 2017 04:40:10 +0000 (05:40 +0100)]
radv: realign cp dma code with radeonsi

This reworks this code to be like radeonsi, which will make it
easier to add GFX9 support to it in the future.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Thu, 1 Jun 2017 04:32:25 +0000 (05:32 +0100)]
radv: bump some base addresses to 64-bits.

For GFX9 these will need to be 64-bit, so bump them early
to avoid any weirdness later.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Thu, 1 Jun 2017 04:24:34 +0000 (05:24 +0100)]
radv: factor out eop event writing code. (v2)

In preparation for GFX9, factor out some of the eop event writing
code.

This changes behaviour but aligns with what radeonsi does:
it now does double emits on CIK/VI, whereas previously it only
did this on CIK.

v2: bump the size checks.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Thu, 1 Jun 2017 04:12:19 +0000 (05:12 +0100)]
radv: factor out si_emit_wait_fence code.

This code was in a few places, consolidate into one.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Jason Ekstrand [Tue, 30 May 2017 16:53:43 +0000 (09:53 -0700)]
intel/blorp: Handle gen6 stencil/HiZ offsets in the back-end

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Jason Ekstrand [Tue, 30 May 2017 16:42:25 +0000 (09:42 -0700)]
intel/isl: Add a helper for getting the byte/tile offset of a subimage

Frequently, get_image_offset_sa is combined with get_intratile_offset_sa
so it makes sense to have a single helper to do both.  If the caller
doesn't want the intratile offsets, it can simply pass NULL and ISL will
assert that they are 0.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
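
The "pass NULL and assert the value is 0" convention can be sketched in
one dimension as below. The function names and the simplified
pixel-to-tile arithmetic are hypothetical and only illustrate the calling
convention; they are not the actual ISL helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store the value if the caller asked for it; otherwise the caller is
 * claiming the value must be zero, so check that claim. */
static void
demo_store_or_assert_zero(uint32_t value, uint32_t *out)
{
   if (out)
      *out = value;
   else
      assert(value == 0);
}

/* Hypothetical 1-D helper: returns the index of the tile containing x_px
 * and optionally reports the leftover offset inside that tile. */
static uint32_t
demo_get_tile_index(uint32_t x_px, uint32_t tile_width_px,
                    uint32_t *intratile_x_px)
{
   assert(tile_width_px != 0);
   demo_store_or_assert_zero(x_px % tile_width_px, intratile_x_px);
   return x_px / tile_width_px;
}
```

A caller that knows its offset is tile-aligned passes NULL and gets the
alignment checked for free in debug builds.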
Jason Ekstrand [Thu, 18 May 2017 21:00:48 +0000 (14:00 -0700)]
intel/isl: Make get_intratile_offset_el take the element size in bits

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Jason Ekstrand [Tue, 30 May 2017 04:45:00 +0000 (21:45 -0700)]
intel/isl: Add a new layout for HiZ and stencil on Sandy Bridge

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Jason Ekstrand [Tue, 30 May 2017 15:28:47 +0000 (08:28 -0700)]
intel/isl: Generate phys_total_el from isl_calc_phys_extent

The only surface layout for which slice0 makes any sense is GEN4_2D.
Move all of the slice0 stuff into isl_calc_phys_total_extent_el_gen4_2d
and make the others trivially return the total size in surface elements.
As a side-effect, array_pitch_el_rows is now returned from these helpers
as well.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Jason Ekstrand [Tue, 30 May 2017 15:45:36 +0000 (08:45 -0700)]
intel/isl: Don't check array pitch for gen4 3D textures

Array pitch doesn't matter in this layout.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Jason Ekstrand [Tue, 30 May 2017 15:18:42 +0000 (08:18 -0700)]
intel/isl: Refactor to use a phys_total_el extent.

We've already implicitly been using a physical total size in surface
elements.  This just centralizes things a bit.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Jason Ekstrand [Tue, 30 May 2017 15:18:57 +0000 (08:18 -0700)]
intel/isl: Add an isl_assert_div helper

This is a fairly common operation and it's nice to be able to just call
the one little function.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
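
A helper of this kind might look like the sketch below: assert that the
division is exact, then return the quotient. The name demo_assert_div and
its exact signature are assumptions for illustration, not the definition
added by the commit.

```c
#include <assert.h>
#include <stdint.h>

/* Divide n by a, asserting that the division leaves no remainder. */
static inline uint32_t
demo_assert_div(uint32_t n, uint32_t a)
{
   assert(a != 0);
   assert(n % a == 0);
   return n / a;
}
```

This replaces the two-step "assert divisibility, then divide" pattern at
each call site with a single call.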
Jason Ekstrand [Tue, 30 May 2017 05:12:36 +0000 (22:12 -0700)]
intel/isl: Refactor isl_calc_array_pitch_el_rows

Over 90% of the function only applies to ISL_DIM_LAYOUT_GEN4_2D anyway
so we can just handle the other two as special cases at the top.  The
two "generic" cases below the switch only apply on gen9 and above and
only to 3D or CCS surfaces.  This implies that they only apply to
surfaces with ISL_DIM_LAYOUT_GEN4_2D.  Making them look generic is a
lie.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Jason Ekstrand [Tue, 30 May 2017 05:10:29 +0000 (22:10 -0700)]
intel/isl: Move isl_calc_array_pitch_el_rows higher up

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Jason Ekstrand [Tue, 30 May 2017 03:32:26 +0000 (20:32 -0700)]
intel/isl: Remove the device parameter from isl_tiling_get_info

We were only using it for validating that we don't use Ys/Yf on gen8 and
earlier.  Removing it from isl_tiling_get_info lets us remove it from a
bunch of other things that had no business needing a hardware
generation.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Jason Ekstrand [Sat, 27 May 2017 17:36:23 +0000 (10:36 -0700)]
i965: Rework Sandy Bridge HiZ and stencil layouts

Sandy Bridge does not technically support mipmapped depth/stencil.  In
order to work around this, we allocate what are effectively completely
separate images for each miplevel, ensure that they are page-aligned,
and manually offset to them.  Prior to layered rendering, this was a
simple matter of setting a large enough halign/valign.

With the advent of layered rendering, however, things got more
complicated.  Now, things weren't as simple as just handing a surface
off to the hardware.  Any miplevel of a normally mipmapped surface can
be considered as just an array surface given the right qpitch.  However,
the hardware gives us no capability to specify qpitch so this won't
work.  Instead, the chosen solution was to use a new "all slices at each
LOD" layout which laid things out as a mipmap of arrays rather than an
array of mipmaps.  This way you can easily offset to any of the
miplevels and each is a valid array.

Unfortunately, the "all slices at each lod" concept missed one
fundamental thing about SNB HiZ and stencil hardware:  It doesn't just
act as if you're always working with a non-mipmapped surface, it
acts as if you're always working on a non-mipmapped surface of the same
size as LOD0.  In other words, even though it may only write the
upper-left corner of each array slice, the qpitch for the array is for a
surface the size of LOD0 of the depth surface.  This mistake causes us
to under-allocate HiZ and stencil in some cases and also to accidentally
allow different miplevels to overlap.  Sadly, piglit test coverage
didn't quite catch this until I started making changes to the resolve
code that caused additional HiZ resolves in certain tests.

This commit switches Sandy Bridge HiZ and stencil over to a new scheme
that lays out the non-zero miplevels horizontally below LOD0.  This way
they can all have the same qpitch without interfering with each other.
Technically, the miplevels still overlap, but things are spaced out
enough that each page is only in the "written area" of one LOD.

Cc: "17.0 17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Kenneth Graunke [Thu, 1 Jun 2017 19:22:01 +0000 (12:22 -0700)]
i965: Drop duplicate shadow variable.

We already initialized this at the top of the function.

Trivial.

Jose Fonseca [Thu, 1 Jun 2017 15:41:13 +0000 (16:41 +0100)]
automake: Link all libGL.so variants with -Bsymbolic.

We were linking src/glx with -Bsymbolic, but not the classic/gallium X11
libGL.so.

But it's always a good idea to build all libGL.so and all DRI drivers
with -Bsymbolic, otherwise they might resolve symbols from the 3rd party
application executable or shared libraries, which is _never_ what we
want.

In particular, this can happen when intercepting OpenGL calls with
apitrace, before
https://github.com/apitrace/apitrace/commit/63194b2573176ef34efce1a5c8b08e624b8dddf5

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Chad Versace [Sat, 27 May 2017 00:28:21 +0000 (17:28 -0700)]
i965/dri: Fix bad GL error in intel_create_winsys_renderbuffer()

This function never occurs in the callchain of a GL function. It occurs
only in the callchain of eglCreate*Surface and the analogous paths for
GLX.  Therefore, even if a thread does have a bound GL context,
emitting a GL error here is wrong. A misplaced GL error, when no GL
call is made, can confuse clients.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Chad Versace [Sat, 27 May 2017 00:26:07 +0000 (17:26 -0700)]
i965: Cleanup in intel_create_winsys_renderbuffer()

Combine variable declarations and assignments.
Trivial cleanup.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Chad Versace [Sat, 27 May 2017 01:44:14 +0000 (18:44 -0700)]
i965: Remove bad assert on isl_format

translate_tex_format() asserted that isl_format != 0. But 0 is a valid
format, ISL_FORMAT_R32G32B32A32_FLOAT.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Chad Versace [Sat, 27 May 2017 01:33:21 +0000 (18:33 -0700)]
i965: Fix return type of translate_tex_format()

It returns an isl_format, not GLuint BRW_FORMAT.  I updated every
translate_tex_format() found by git-grep.

No change in behavior.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Chad Versace [Sat, 27 May 2017 01:22:40 +0000 (18:22 -0700)]
i965: Fix return type of brw_isl_format_for_mesa_format() [v2]

It returns an isl_format, not uint32_t BRW_FORMAT.
I updated every brw_isl_format_for_mesa_format() found by git-grep.

No change in behavior.

v2: Rebased atop Anuj's patch, which has some of the same fixes.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1)
Anuj Phogat [Wed, 24 May 2017 20:57:57 +0000 (13:57 -0700)]
i965: Remove an extra semicolon

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Anuj Phogat [Thu, 27 Apr 2017 17:22:40 +0000 (10:22 -0700)]
i965: Rename brw_format variable names to isl_format

This patch makes no functional changes. The renaming is just to
make the code more readable.

V2: update the types to "enum isl_format"

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Chad Versace [Tue, 30 May 2017 16:53:28 +0000 (09:53 -0700)]
i965: Reject unsupported formats in glEGLImageTargetTexture2D()

If the EGLImage's format is not a supported texture format according to
brw_surface_formats.c, then refuse to create the miptree. This follows
the precedent in glEGLImageRenderbufferStorage (implemented by
intel_image_target_renderbuffer_storage), which rejects the EGLImage's
format if it is not renderable.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Kenneth Graunke [Mon, 15 May 2017 23:15:13 +0000 (16:15 -0700)]
genxml: Make 3DSTATE_CONSTANT_BODY on Gen7+ use arrays.

This will let us initialize the constant buffers with loops.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Kenneth Graunke [Fri, 19 May 2017 22:41:31 +0000 (15:41 -0700)]
genxml: Fix decoder to print the array element on field members.

Previously we'd print things like:

   0xfffbb568:  0x00010000 : Dword 1
       ReadLength: 0
       ReadLength: 1
   0xfffbb568:  0x00000001 : Dword 1
       ReadLength: 1
       ReadLength: 0

instead of the more obvious:

   0xfffbb568:  0x00010000 : Dword 1
       ReadLength[0]: 0
       ReadLength[1]: 1
   0xfffbb568:  0x00000001 : Dword 1
       ReadLength[2]: 1
       ReadLength[3]: 0

(Yes, the ralloc context here is bogus - the decoder leaks just about
everything.  We need to use proper ralloc contexts someday...)

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Kenneth Graunke [Fri, 19 May 2017 22:31:35 +0000 (15:31 -0700)]
genxml: Fix decoding of array groups.

If you had a group as the first element of a struct, i.e.

  <struct name="3DSTATE_CONSTANT_BODY" length="10">
    <group count="4" start="0" size="16">
      <field name="ReadLength" start="0" end="15" type="uint"/>
    </group>
    ...
  </struct>

we would get a group_offset of 0, causing create_field() to think the
field wasn't in a group, and fail to offset forward for successive array
elements.  So we'd mark all the array elements as offset 0.

Using ctx->group->elem_size is a better check for "are we in a group?".

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
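
To make the offset arithmetic concrete, the sketch below computes where a
field of array element N starts, using the element size as the "are we in
a group?" test. The struct and function names are hypothetical and the
code only models the idea described above; it is not the decoder's actual
create_field() logic.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a genxml <group start=... size=...>. */
struct demo_group {
   uint32_t start;      /* bit offset of the group within the struct */
   uint32_t elem_size;  /* size of one array element in bits; 0 = no group */
};

/* Start bit of a field for array element `elem`. Testing elem_size rather
 * than the group offset keeps groups that begin at bit 0 working. */
static uint32_t
demo_field_start_bit(const struct demo_group *group,
                     uint32_t field_start, uint32_t elem)
{
   if (group->elem_size == 0)
      return field_start;   /* plain field, not part of an array group */

   return group->start + elem * group->elem_size + field_start;
}
```

For the 3DSTATE_CONSTANT_BODY example (start 0, size 16), this places
ReadLength elements at bits 0, 16, 32 and 48 instead of all at bit 0.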
Kenneth Graunke [Fri, 19 May 2017 22:25:21 +0000 (15:25 -0700)]
genxml: Fix decoder for groups with multiple fields.

If you have something like:

    <group count="0" start="96" size="32">
      <field name="Entry_0" start="0" end="15" type="GATHER_CONSTANT_ENTRY"/>
      <field name="Entry_1" start="16" end="31" type="GATHER_CONSTANT_ENTRY"/>
    </group>

We would reset ctx->group_count to 0 after processing the first field,
so the second would not have a group count.

This is largely untested, as the only groups with multiple fields are
packets we don't emit in Mesa.  Found by inspection.

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
7 years agogenxml: Fix parsing of address fields in groups.
Kenneth Graunke [Mon, 15 May 2017 23:53:25 +0000 (16:53 -0700)]
genxml: Fix parsing of address fields in groups.

For example,

    <group count="4" start="64" size="64">
      <field name="Pointer" start="5" end="63" type="address"/>
    </group>

used to generate:

   const uint64_t v2_address =
      __gen_combine_address(data, &dw[2], values->Pointer, 0);
   ...
   const uint64_t v4_address =
      __gen_combine_address(data, &dw[4], values->Pointer, 0);
   ...

but now generates code with proper subscripts:

   const uint64_t v2_address =
      __gen_combine_address(data, &dw[2], values->Pointer[0], 0);
   ...
   const uint64_t v4_address =
      __gen_combine_address(data, &dw[4], values->Pointer[1], 0);
   ...

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
7 years agoconfigure.ac: simplify --enable-libunwind=auto check
Eric Engestrom [Thu, 1 Jun 2017 14:06:57 +0000 (15:06 +0100)]
configure.ac: simplify --enable-libunwind=auto check

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
7 years agoutil/rand_xor: add missing include statements
Nicolas Dechesne [Thu, 1 Jun 2017 10:13:18 +0000 (12:13 +0200)]
util/rand_xor: add missing include statements

Fixes for:

src/util/rand_xor.c:60:13: error: implicit declaration of function 'open' [-Werror=implicit-function-declaration]
    int fd = open("/dev/urandom", O_RDONLY);
             ^~~~
src/util/rand_xor.c:60:34: error: 'O_RDONLY' undeclared (first use in this function)
    int fd = open("/dev/urandom", O_RDONLY);
                                  ^~~~~~~~
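
For reference, POSIX declares open() and O_RDONLY in <fcntl.h>. The sketch below shows a /dev/urandom read that compiles without the errors above. The helper read_urandom is hypothetical, and which headers the commit actually added is an assumption here, since the log does not list them.

```c
#include <fcntl.h>   /* open(), O_RDONLY */
#include <unistd.h>  /* read(), close() */
#include <stddef.h>

/* Hypothetical helper: fill buf with len bytes from /dev/urandom.
 * Returns 1 on success and 0 on any failure or short read. */
static int
read_urandom(void *buf, size_t len)
{
   int fd = open("/dev/urandom", O_RDONLY);
   if (fd < 0)
      return 0;

   ssize_t n = read(fd, buf, len);
   close(fd);

   return n == (ssize_t) len;
}
```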

Signed-off-by: Nicolas Dechesne <nicolas.dechesne@linaro.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
7 years agoetnaviv: always do cpu_fini in transfer_unmap
Lucas Stach [Thu, 18 May 2017 13:39:58 +0000 (15:39 +0200)]
etnaviv: always do cpu_fini in transfer_unmap

The cpu_fini() call pushes the buffer back into the GPU domain, which needs
to be done for all buffers, not just the ones with CPU written content. The
etnaviv kernel driver currently doesn't validate this, but may start to do
so at a later point in time. If there is a temporary resource, the fini needs
to happen before the RS uses it as the source for the upload.

Also remove an invalid comment about flushing CPU caches; cpu_fini takes
care of everything involved in this.

Fixes: c9e8b49b885 ("etnaviv: gallium driver for Vivante GPUs")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-By: Wladimir J. van der Laan <laanwj@gmail.com>
7 years agodocs: update calendar, add news item and link release notes for 17.0.7
Emil Velikov [Thu, 1 Jun 2017 10:46:39 +0000 (11:46 +0100)]
docs: update calendar, add news item and link release notes for 17.0.7

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years agodocs: add sha256 checksums for 17.0.7
Emil Velikov [Thu, 1 Jun 2017 10:41:56 +0000 (11:41 +0100)]
docs: add sha256 checksums for 17.0.7

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit bdfd5658e7cd4c6925afa06bb858c0601865a1ea)

7 years agodocs: add release notes for 17.0.7
Emil Velikov [Thu, 1 Jun 2017 10:34:38 +0000 (11:34 +0100)]
docs: add release notes for 17.0.7

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 46cc7a1746e03b1672c8508af49eb60546d5b61d)

7 years agoglsl: fix a crash in ir_print_visitor() for bindless samplers/images
Samuel Pitoiset [Thu, 25 May 2017 17:12:12 +0000 (19:12 +0200)]
glsl: fix a crash in ir_print_visitor() for bindless samplers/images

Bindless samplers/images are represented with 64-bit unsigned
integers and they can be assigned with explicit constructors.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
7 years agoglsl: teach opt_array_splitting about bindless images
Samuel Pitoiset [Thu, 25 May 2017 16:55:09 +0000 (18:55 +0200)]
glsl: teach opt_array_splitting about bindless images

Memory/format layout qualifiers shouldn't be lost when arrays
of images are split by this pass.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
7 years agoglsl: teach opt_structure_splitting about images in structures
Samuel Pitoiset [Thu, 25 May 2017 16:36:35 +0000 (18:36 +0200)]
glsl: teach opt_structure_splitting about images in structures

GL_ARB_bindless_texture allows images to be declared inside
structures, but when memory/format qualifiers are used, they
should be propagated when structures are split.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
7 years agoglsl: fix broken indentation in do_structure_splitting()
Samuel Pitoiset [Thu, 25 May 2017 16:29:50 +0000 (18:29 +0200)]
glsl: fix broken indentation in do_structure_splitting()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
7 years agoglsl: handle format layout qualifiers for struct with array of images
Samuel Pitoiset [Thu, 25 May 2017 14:26:42 +0000 (16:26 +0200)]
glsl: handle format layout qualifiers for struct with array of images

This handles a situation like:

struct {
   layout (r32f) image2D imgs[6];
} s;

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
7 years agoglsl: handle memory qualifiers for struct with array of images
Samuel Pitoiset [Thu, 25 May 2017 14:19:58 +0000 (16:19 +0200)]
glsl: handle memory qualifiers for struct with array of images

This handles a situation like:

struct {
   image2D imgs[6];
} s;

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
7 years agonvc0: Clean up unnecessary includes from gallium/auxiliary/vl/
Rhys Kidd [Wed, 31 May 2017 22:48:09 +0000 (18:48 -0400)]
nvc0: Clean up unnecessary includes from gallium/auxiliary/vl/

Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
7 years agoi965: Simplify SO_DECL handling.
Kenneth Graunke [Tue, 28 Feb 2017 22:05:55 +0000 (14:05 -0800)]
i965: Simplify SO_DECL handling.

We can initialize structs directly, avoid some temporaries, and cut out
about half of the skip component handling.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
7 years agoi965: Make a local for linked_xfb->Outputs[i], to shorten things.
Kenneth Graunke [Tue, 28 Feb 2017 21:38:35 +0000 (13:38 -0800)]
i965: Make a local for linked_xfb->Outputs[i], to shorten things.

This seems a bit more readable.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
7 years agoi965: Move SOL PSIZ hacks from draw time to link time.
Kenneth Graunke [Tue, 28 Feb 2017 20:29:43 +0000 (12:29 -0800)]
i965: Move SOL PSIZ hacks from draw time to link time.

We can just update the gl_transform_feedback_info fields at link time
to make the VUE header fields have the right location and component.
Then we don't need to handle them specially at draw time, which is
expensive.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
7 years agomesa/main: replace remaining uses of IROUND() in GetUniform*() by round()
Iago Toral Quiroga [Thu, 18 May 2017 09:43:57 +0000 (11:43 +0200)]
mesa/main: replace remaining uses of IROUND() in GetUniform*() by round()

These were correct since they were used only in conversions to signed integers,
however this makes the implementation a bit more consistent and reduces the
chances of propagating use of these macros to unsigned cases in the future, which
would not be correct.

Reviewed-by: Matt Turner <mattst88@gmail.com>
7 years agomesa/main: conversion from float in GetUniformi64v requires rounding to nearest
Iago Toral Quiroga [Thu, 18 May 2017 09:43:56 +0000 (11:43 +0200)]
mesa/main: conversion from float in GetUniformi64v requires rounding to nearest

As we do for all other cases of float/double conversions to integers.

v2: use round() instead of IROUND() macros (Iago)

Reviewed-by: Matt Turner <mattst88@gmail.com>
7 years agomesa/main: Add conversion from double to uint64/int64 in GetUniform*i64v()
Iago Toral Quiroga [Thu, 18 May 2017 09:43:55 +0000 (11:43 +0200)]
mesa/main: Add conversion from double to uint64/int64 in GetUniform*i64v()

v2:
  - need unsigned rounding for double->uint64 conversion (Nicolai)
  - use round() instead of IROUND() macros (Iago)

Reviewed-by: Matt Turner <mattst88@gmail.com>
7 years agomesa/main: Clamp GetUniformui64v values to be >= 0
Iago Toral Quiroga [Thu, 18 May 2017 09:43:54 +0000 (11:43 +0200)]
mesa/main: Clamp GetUniformui64v values to be >= 0

Like we do for the 32-bit case.

v2:
  - need unsigned rounding for float->uint64 conversion (Nicolai)
  - use roundf() instead of IROUND() macros (Iago)

Reviewed-by: Matt Turner <mattst88@gmail.com>
7 years agomesa/main: Clamp GetUniformuiv values to be >= 0
Kenneth Graunke [Thu, 18 May 2017 09:43:53 +0000 (11:43 +0200)]
mesa/main: Clamp GetUniformuiv values to be >= 0

Section 2.2.2 (Data Conversions For State Query Commands) of the
OpenGL 4.5 October 24th 2016 specification says:

"If a command returning unsigned integer data is called, such as
 GetSamplerParameterIuiv, negative values are clamped to zero."

v2: uint to int conversion should clamp to INT_MAX (Nicolai)

v3 (Iago)
  - Add conversions from 64-bit integer paths
  - Rebase on master

v4:
  - need unsigned rounding for float/double->uint conversions (Nicolai)
  - use round{f}() instead of IROUND() macros (Iago)
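
The integer clamping rules from the spec quote and the v2 note can be sketched as below. The function names are hypothetical and this is not the code in _mesa_get_uniform(). The float and double paths, which also need round{f}(), are left out.

```c
#include <stdint.h>

/* Signed -> unsigned query result: negative values are clamped to zero. */
static uint32_t
int_to_uint_clamped(int32_t v)
{
   return v < 0 ? 0u : (uint32_t) v;
}

/* Unsigned -> signed query result: values above INT32_MAX are clamped. */
static int32_t
uint_to_int_clamped(uint32_t v)
{
   return v > (uint32_t) INT32_MAX ? INT32_MAX : (int32_t) v;
}

/* Same rule as int_to_uint_clamped for the 64-bit paths. */
static uint64_t
int64_to_uint64_clamped(int64_t v)
{
   return v < 0 ? 0u : (uint64_t) v;
}
```

Without the clamps, a plain cast would wrap: -7 read back as unsigned would become 4294967289 rather than 0.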

Fixes:
KHR-GL45.gpu_shader_fp64.state_query

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v2)
Reviewed-by: Matt Turner <mattst88@gmail.com>
7 years agomesa/main: fix indentation in _mesa_get_uniform()
Iago Toral Quiroga [Thu, 18 May 2017 09:43:52 +0000 (11:43 +0200)]
mesa/main: fix indentation in _mesa_get_uniform()

v2: also change the style of the large conditional in that function
    to follow the style from most other parts of Mesa (Nicolai)

Reviewed-by: Matt Turner <mattst88@gmail.com>