git.libre-soc.org Git - mesa.git/log

gallium/docs: clairify dmabuf fd ownership

Since debugging issues w/ fd's close()d at the wrong time can be quite
fun, this should probably be made more explicit in the docs.

Signed-off-by: Rob Clark <robclark@freedesktop.org>

android: radeonsi: add support for sid_tables.h generated sources

This patch is necessary to avoid building error on android,
due to missing sid_tables.h generated sources

v2:[Emil Velikov] Correctly split the lists.

Fixes: fbbebeae10f(radeonsi: inline si_cmd_context_control)
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

android: Always define __STDC_LIMIT_MACROS.

Analogous to commit 02a4fe22b13 (configure.ac: Always define
__STDC_LIMIT_MACROS.)

v2: [Emil Velikov] keep the LLVM specific __STDC_FORMAT_MACROS

Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

android: rename LLVM_VERSION_PATCH to MESA_LLVM_VERSION_PATCH

Fixes: 797f4eacea8(configure.ac: rename LLVM_VERSION_PATCH to avoid
conflict with llvm-config.h)
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>

nouveau: android: add space before PRIx64 macro

Otherwise the android build fails with

error : unable to find string literal operator ‘operator"" PRIx64’

There are several resources referring to the problem, which is related
to c++11, in our case used when building mesa for lollipop.

http://comments.gmane.org/gmane.comp.graphics.opensg.user/5883

I've not investigated all the semantics, some people even suggested a
bug in the gcc compiler,
I just saw the building error was solved with one little space for
lollipop and no side effect when c+11 not used.

v2: [Emil Velikov] add an alternative commit message from Mauro.

Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>

svga: pick all the files into the tarball

Signed-off-by: Emil Velikov <emil.velikov@collabora.co.uk>

auxiliary: rework the python generated sources rules

There are a few bits this commit aims to resolve:

One can generalise the mkdir rule to a simple MKDIR_P $(@D) which will
expand appropriately for even if we change the subdir name, and/or add
new rules. We can also drop the explicit $(srcdir) prefix for the
dependency rules, they they are not strictly required, nor used
elsewhere in mesa.

Finally replace $< with explicit filename to be consistent through the
file, and honour PYTHON_FLAGS.

v2: Add comprehensive commit summary/message (Ian, Matt)

Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

glsl: build: remove bogus dependency

v2: rebase on top of the previous commit - don't touch the LOCAL_PATH
prefix for nir_constant_expressions.h

Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>

glsl: build: use makefile.sources variables when possible

Rather than folding one variable within the other only to unwrap them,
just use the ones we need.

v2: bring back LOCAL_PATH prefix for nir_constant_expressions,h

Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com> (v1)

glsl: automake: reuse $(NIR_GENERATED_FILES) where possible

Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>

glsl: automake: rework the sources generation rules

The glsl equivalent of "mesa: automake: rework the source generation
rules". Plus let's make things consistent and always explicitly provide
the header name.

v2: Rebase on top of reverted "remove custom AM_V_LEX/YACC" (Matt)

Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

mesa: automake: rework the source generation rules

Same logic as previous commit applies.

Additionally remove the odd (set -e/mv/INDENT) from the rules.
The last one is the only one we remotely care about, if reading the
generated sources.

Upcoming work from DylanB which will replace the existing python
scripts with ones that produce more readable output anyway.

Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>

mapi: automake: rework the source generation rules

Same logic as previous commit applies. Also fix bogus MESA_MAPI_DIR -
the sources are located in the source dir (duh).

Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>

mapi: automake: rework the *api/glapi_mapi_tmp.h rules

Same logic as previous commit applies.

v2: Merge with "inline glapi_gen_mapi define" (Matt)

Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>

util: automake: rework the format_srgb.c rule

A handful of changes/cleanups paving the way to bmake support:
- Remove optional $(srcdir)/ prefix for files in the prereq list.
- Drop the space after the AM_V_GEN variable.
- Using $< in a non-suffix rule is a GNU make idiom.
- Use $(@D) over $(dir $@). The latter is a POSIX standard.

v2: Cosmetic tweaks in the commit summary.

Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com> (v1)

xmlpool: 'promote' LOCALEDIR variable

This is the only place in mesa that uses this constuct which seems
to be GNUmake-ism. Attempting to build with POSIX make implementations
(bmake) would fail as below.

--- options.h ---
LOCALEDIR := .
sh: line 2: LOCALEDIR: command not found
*** [options.h] Error code 127

So let's keep things consistent and compatible by making the variable
non target specific.

v2:
- Bring back LOCALEDIR.
- Reword the commit message
- Change mesa-stable tag 10.6 > 11.0

Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Cc: Jonathan Gray <jsg@jsg.id.au>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

egl_dri2: Add support for EGL_KHR_create_contest when using swrast

This requires swrast version >= 3. Also EGL_EXT_create_context_robostness
is supported if __DRI2_ROBUSTNESS extension is found.

Reference: https://bugs.freedesktop.org/show_bug.cgi?id=80821
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>

egl_dri2: Use createContextAttribs if swrast version >= 3

v2: Change return type of the new function from int to bool

Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>

egl_dri2: Move filling context_attrib array in a separate function

v2: Change return type of the new function from int to bool

Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>

mesa: Allow query of GL_VERTEX_BINDING_BUFFER

According to OpenGL ES 3.1 specification table : 20.2 and
OpenGL specification 4.4 table 23.4. The glGetIntegeri_v
functions should report the name of the buffer bound
when called with GL_VERTEX_BINDING_BUFFER.

Signed-off-by: Marta Lofstedt <marta.lofstedt@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>

mesa/es3.1: Enable GL_MAX_VERTEX_ATTRIB enums for GLES 3.1

Signed-off-by: Marta Lofstedt <marta.lofstedt@linux.intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>

i965/nir: Use nir_system_value_from_intrinsic to reduce duplication.

This code is all pretty much identical. We just needed the translation
from one enum value to the other.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>

nir: Add a nir_system_value_from_intrinsic() function.

This converts NIR intrinsics that load system values into Mesa's
SYSTEM_VALUE_* enumerations.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>

i965: Mark topologies with adjacency information as G45+.

These didn't exist on the original 965.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>

i965: Fix value of _3DPRIM_TRIFAN_NOSTIPPLE.

TRIFAN_NOSTIPPLE has always been 0x16 - 0x15 is marked "Reserved" on all
platforms. See the 965 PRM, Volume 2, Table 3-1, "3D Primitive Topology
Type Encoding" for a list.

We don't currently use this, and I don't expect we will, but we may as
well not leave the bogus value around.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>

i965: Add 64-bit dirty flag handling to brw_upload_pull_constants

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

i965: Add defines for all new Gen7/8 URB opcodes

Tessellation needs to emit URB reads and atomics;

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

i965/gen8+: Skip depth stalls on state change

Docs suggest this is no longer required starting with Gen8.

Perf (no regressions in n=20)
OglMultithread       0.67%
OglTerrainPanInst    0.12%
trex                 0.45%
warsow               0.64%

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

r600: don't use shader key without verifying shader type (v2)

Since 7a32652231f96eac14c4bfce02afe77b4132fb77
r600: Turn 'r600_shader_key' struct into union

we were accessing key fields that might be aliased in the union
with other fields, so we should check what shader type we are
compiling for before using key values from it.

v1.1: make it compile
v2: have caffeine, make it work - we don't set type
until later, so don't reference it until we've set it.

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>

i965/skl: Use more compact hiz dimensions

I meant to do this here, but it was in the wrong place:

commit c1151b18f2dce7c6f238f057e9c4fa8d912ce6b5
Author: Ben Widawsky <benjamin.widawsky@intel.com>
Date: Wed Jun 24 20:07:54 2015 -0700

i965/skl: Use more compact hiz dimensions

NOTE: Jordan did go back and look at the original mailing list post. I mailed
the right thing, and pushed the wrong one.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Neil Roberts <neil@linux.intel.com>

st/mesa: increase viewport bounds limits for GL4 hw

According to the ARB_viewport_array spec, GL4 limit is higher than the
GL3 limit. Also take this opportunity to fix the GL3 limit.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

nvc0: always emit a full shader colormask

Indications are that if the colormask indicates a single bit set on
fermi, that value will always be read from $r0 instead of a potentially
higher register (if e.g. green is set). Not to upset the counting logic,
always set the header up with a full color mask for each RT. Such a
situation can basically only ever happen with generated blit shaders.

Fixes the following piglit on Fermi (Kepler is unaffected):
fbo-stencil blit GL_DEPTH32F_STENCIL8

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>

docs: fix date formatting in index.html

nir: UBO loads no longer use const_index[1]

Commit 2126c68e5cba killed the array elements parameter on load/store
intrinsics that was stored in const_index[1]. It looks like that
patch missed to remove this assignment in the UBO path.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>

nv30: Fix max width / height checks in nv30 sifm code

The sifm object has a limit of 1024x1024 for its input size and 2048x2048
for its output. The code checking this was trying to be clever resulting
in it seeing a surface of e.g 1024x256 being outside of the input size
limit.

This commit fixes this.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>

i965: Disallow fast blit paths for CopyTexImage with PixelTransfer ops

glCopyTexImage behaves similarly to glReadPixels with respect to the
pixel transfer operations. Therefore if any are set we cannot use the
simple blit-only fast paths.

(Though if would be possible to relax the blorp path to handle
pixel zoom, or we can just enhance meta.)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviwewed-by: Iago Toral <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org

mesa/tests: Remove unneeded X11_CFLAGS

X11_CFLAGS is never defined. Path to X11 headers is not needed here, so
just remove.

Future work: Using AM_CFLAGS here looks wrong, as this Makefile only builds
C++ files

Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>

glxl/tests: Use X11_INCLUDES instead of X11_CFLAGS

X11_CFLAGS is undefined, so these tests will fail to build if x11proto is
installed in a non-standard location.

(See also commits 35189d76, bc93c3798, 54b028ba, d901d7e08, etc.)

Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>

svga: Fix surface view error handling

Make sure errors are correcly propagated.
Also don't flush during state emission if emission fails.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>

xa: add xa_surface_from_handle2 v2

Like xa_surface_from_handle(), but takes a handle type, rather than
hard-coding 'shared' handle.  This is needed to fix bugs seen with
xf86-video-freedreno with xrandr rotation, for example.  The root issue
is that doing a GEM_OPEN ioctl on a bo that already has a GEM handle
associated with the drm_file will result in two unique handles for the
same bo.  Which causes all sorts of follow-on fail.

v2:
- Add support for for fd handles.
- Avoid duplicating code.
- Bump xa version minor.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>

i965/nir/vec4: removed unneeded tex src swizzle set

At that point the swizzle should be correct.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>

util: make mesa-sha1.c completely empty when there are no SHA1 impls

My earlier attempt to fix this missed the fact that there was a #else
clause that assumes that you have openssh. This moves the whole thing
under #ifdef HAVE_SHA1 which should avoid this issue.

Fixes: 13bfa5201 (util: always include sha1 into the build)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91898
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@gmail.com>

util: always include sha1 into the build

SHA1 is now used in all builds when HAVE_SHA1 is defined. Adjust src to
do the same thing, rather than predicating on shader cache.

Fixes: 04e201d0c02 ("mesa: change 'SHADER_SUBST' facility to work with env variables")
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@gmail.com>

st/mesa: don't fall back to 16F when 32F is requested

Nothing in the spec allows for the reduced precision, and this also
fixes st_QuerySamplesForFormat for nv50, which does not allow MS8 on
RGBA32F. Now this will be respected instead of reporting MS8 as
supported with an assumption that the format used will be RGBA16F.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

st/mesa: properly handle u_upload_alloc failure

vbuf is never null. We want to make sure that a resource was allocated
for the vbuf, which is *vbuf.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

nouveau: don't mark full range as used on unmap with explicit flush

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org

nv50: avoid using inline vertex data submit when gl_VertexID is used

The hardware only generates vertexid when vertices come from a VBO. This
fixes:

vertexid-drawelements
vertexid-drawarrays

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.0" <mesa-stable@lists.freedesktop.org>

nv50: don't flush vertex arrays when index buffer changes

The index buffer is fed in inline over a pushbuf. It's not related to
vertices or any caching that might be done on them.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org

nv50: rebind bo to bufctx when invalidating idxbuf storage

There is nothing to be done on a dirty idxbuf, but the bo may have
changed, so we have to rebind it to the bufctx.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org

nv50: clear buffer status on all vertex bufs, not just the first one

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org

nv50: fix drawing from tfb, direct-to-pushbuf submits

The stride was being set to 0, which is illegal (and also non-sensical).
Also we must wait for the buffer to become available for reading as
otherwise a wrong value may be prefetched. Since we must wait for the
buffer anyways, and it's mapped and in GART, we may as well avoid the
annoyance of the indirect pushbuf submit.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org

i965: Remove base miplevel from sampler state.

Gen9 changes the meaning of this to coarse LOD quality mode. Although that's a
desirable thing to be setting, it doesn't match the gen8 behavior and this was
unintentional. More importantly, we don't ever use this field. So instead of
getting it "wrong" drop it entirely.

This is a respin of a patch which only [incorrectly] tried to address gen9.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

docs: add news item and link release notes for 10.6.6

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

docs: add sha256 checksums for 10.6.6

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit e3e2a3e0e581da39dcd9268951edb52f68916940)

docs: add release notes for 10.6.6

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 4b05739e9d718a48415270b95c0a73b56666c364)

llvmpipe: convert double to long long instead of unsigned long long

round(val*dscale) produces a double result, as val and dscale are double.
However, LLVMConstInt receives unsigned long long, so there is an
implicit conversion from double to unsigned long long.
This is an undefined behavior. Therefore, we need to first explicitly
convert the round result to long long, and then let the compiler handle
conversion from that to unsigned long long.

This bug manifests itself in POWER, where all IMM values of -1 are being
converted to 0 implicitly, causing a wrong LLVM IR output.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
CC: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>

nv30: Implement color resolve for msaa

Note this is not ideal. Since the sifm can only do source sizes upto
1024x1024 we end up using the blitter on nv4x, which is not that fast.

And on nv3x we end up using the cpu which is really slow.

Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>

nv30: Fix creation of scanout buffers

Scanout buffers on nv30 must always be non-swizzled and have special
width alignment constraints.

These constrains have been taken from the xf86-video-nouveau
src/nv_accel_common.c: nouveau_allocate_surface() function.

nouveau_allocate_surface() applies these width constraints only when a
tiled attribute is set, which it sets for all surfaces allocated via
dri, and this "tiling" is not the same as swizzling, scanout surfaces
must be linear / have a uniform_pitch or only complete garbage is shown.

This commit fixes dri3 on nv30 showing a garbled display, with dri3 the
scanout buffers are allocated by mesa, rather then by the ddx, and the
wrong stride of these buffers was causing the garbled display.

Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>

vc4: Initialize pack field of qreg to 0 in qir_get_temp

This avoids generation of undefined packing in qir and qpu instructions,
fixing a lot of rendering errors.

Fixes 8b36d107fdd (vc4: Pack the unorm-packing bits into a src MUL
instruction when possible.)

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>

i965: Disallow PixelTransfer operations for tiled-memcpy TexImage/ReadPixels

The tiled memcpy fast paths perform a simple blit (with only a couple of
trivial pixel conversion routines) and do not accommodate PixelTransfer
operations. Therefore if any are set, fallback to the regular routines.
Note that PixelTransfer only applies to TexImage and ReadPixels, not to
GetTexImage.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org

i965/vec4: Don't unspill the same register in consecutive instructions

If we have spilled/unspilled a register in the current instruction, avoid
emitting unspills for the same register in the same instruction or consecutive
instructions following the current one as long as they keep reading the spilled
register. This should allow us to avoid emitting costy unspills that come with
little benefit to register allocation.

v2:
  - Apply the same logic when evaluating spilling costs (Curro).

v3:
  - Abstract the logic that decides if a register can be reused in a function.
    that can be used from both spill_reg and evaluate_spill_costs (Curro).

v4:
  - Do not disallow reusing scratch_reg in predicated reads (Curro).
  - Track if previous sources in the same instruction read scratch_reg (Curro).
  - Return prev_inst_read_scratch_reg at the end (Curro).
  - No need to explicitily skip scratch read/write opcodes in spill_reg (Curro).
  - Fix the comments explaining what happens when we hit an instruction that
    does not read or write scratch_reg (Curro)
  - Return true early when the current or previous instructions read
    scratch_reg with a compatible mask.

v5:
  - Do not return true early, the loop should not be expensive anyway
    and this adds more complexity (Curro).

Reviewed-by: Francisco Jerez <currojerez@riseup.net>

i965: Add a debug option for spilling everything in vec4 code

Reviewed-by: Francisco Jerez <currojerez@riseup.net>

dri/common: Tokenize driParseDebugString() argument before matching debug flags.

Fixes debug string parsing when one of the supported flags is a
substring of another.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>

dri/common: Fix codestyle of driParseDebugString().

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>

glsl: error out on ES 3.1 if VS or FS present but not both

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>

glsl: error on linking if no shaders are attached to program

This applies to OpenGL Core >= 4.5 and OpenGL ES >= 3.1.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>

i965: Improve disassembly of data port read messages.

We now print out the name of the message instead of its numerical
value, and label the message control and surface numbers.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>

i965: Optimize VUE map comparisons.

The entire VUE map is computed based on the slots_valid bitfield;
calling brw_compute_vue_map on the same bitfield will return the
same result. So we can simply compare those.

struct brw_vue_map is 136 bytes; doing a single 8-byte comparison is
much cheaper and should work just as well.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>

i965/gs: Don't reserve space for clip plane uniforms.

These were only for legacy userclipping, which we no longer support
in geometry shaders.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>

i965: Don't do legacy userclipping in non-compatibility contexts.

According to the GLSL 1.50 specification, page 76:
"The shader must also set all values in gl_ClipDistance that have been
enabled via the OpenGL API, or results are undefined."

With this patch, we only enable clip distance writes when the shader
actually writes them. We no longer force a value to be written when
clip planes are enabled in the API. This could mean the first varying
slot would be used as clip distances - I believe it should be the safe
kind of undefined behavior.

Empirically, it doesn't seem to cause a problem.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>

i965: Remove the brw_vue_prog_key base class.

The legacy userclip fields are only used for the vertex shader, and at
that point there's only program_string_id and the tex struct, which are
common to all keys. So there's no need for a "VUE" key base class.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>

i965: Virtualize vec4_visitor::emit_urb_slot().

This avoids a downcast of key, which won't exist in the base class soon.

I'm not a huge fan of this patch, but given that we're currently using
inheritance, this seems like the "right" way to do it. The alternative
is to make key a void pointer in the parent class and continue
downcasting.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>

i965: Store a key_tex pointer in vec4_visitor.

I'm about to remove the base class for VS/GS/HS/DS program keys, at
which point we won't be able to use key->tex anymore. Instead, we'll
need to store a direct pointer (like we do in the FS backend).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>

i965: Move legacy clip plane handling to vec4_vs_visitor.

This is now only used for the vertex shader, so it makes sense to get it
out of any paths run by the geometry shader.

Instead of passing the gl_clip_plane array into the run() method (which
is shared among all subclasses), we add it as a vec4_vs_visitor
constructor parameter. This eliminates the bogus NULL parameter in the
GS case.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>

i965: Delete the brw_vue_program_key::userclip_active flag.

There are two uses of this flag.

The primary use is checking whether we need to emit code to convert
legacy gl_ClipVertex/gl_Position clipping to clip distances.  In this
case, we also have to upload the clip planes as uniforms, which means
setting nr_userclip_plane_consts to a positive value.  Checking if it's
> 0 works for detecting this case.

Gen4-5 also wants to know whether we're doing clipping at all, so it can
emit user clip flags.  Checking if output_reg[VARYING_SLOT_CLIP_DIST0]
is set to a real register suffices for this.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>

i965: Remove legacy clip plane handling from geometry shaders.

We only support geometry shaders in core profiles, where gl_ClipVertex
doesn't exist.  Presumably the even older behavior of clipping to
gl_Position isn't supported either.  In fact, GLSL 1.50 page 76 claims:

"The shader must also set all values in gl_ClipDistance that have been
enabled via the OpenGL API, or results are undefined."

So we don't need to handle legacy clipping in geometry shaders.  I think
Paul added this back when we were considering supporting the old
GL_ARB_geometry_shader4 extension.

This removes a non-orthagonal state dependency on GS compilation.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>

i965: Move brw_setup_tex_for_precompile to brw_program.[ch].

This living in brw_fs.{h,cpp} is a historical artifact of us supporting
texturing for fragment shaders before any other stages. It's kind of
awkward given that we use it for all stages.

This avoids having to include brw_fs.h in geometry shader code in order
to access this function.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>

mesa: change 'SHADER_SUBST' facility to work with env variables

Patch modifies existing shader source and replace functionality to work
with environment variables rather than enable dumping on compile time.
Also instead of _mesa_str_checksum, _mesa_sha1_compute is used to avoid
collisions.

Functionality is controlled via two environment variables:

MESA_SHADER_DUMP_PATH - path where shader sources are dumped
MESA_SHADER_READ_PATH - path where replacement shaders are read

v2: cleanups, add strerror if fopen fails, put all functionality
inside HAVE_SHA1 since sha1 is required

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Suggested-by: Eero Tamminen <eero.t.tamminen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>

build: add HAVE_SHA1 define when using --with-sha1 option

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Brian Paul <brianp@vmware.com>

i965: Fix copy propagation type changes.

commit 472ef9a02f2e5c5d0caa2809cb736a0f4f0d4693 introduced code to
change the types of SEL and MOV instructions for moves that simply
"copy bits around".  It didn't account for type conversion moves,
however.  So it would happily turn this:

   mov(8) vgrf6:D, -vgrf5:D
   mov(8) vgrf7:F, vgrf6:UD

into this:

   mov(8) vgrf6:D, -vgrf5:D
   mov(8) vgrf7:D, -vgrf5:D

which erroneously drops the conversion to float.

Cc: "11.0 10.6" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>

r600: fix loop overrun in cayman_mul_double_instr

Coverity warned about this. Ilia pointed it out.

Signed-off-by: Dave Airlie <airlied@redhat.com>

i965/gen9: Annotate input coverage mask change

As far as I can tell, the behavior is preserved from the previous generations.
Before we set a single bit to tell the FS whether or not we'll be using an input
coverage mask. Now we have some options which are implementing various
extensions. These bits are used for the various conservative rasterization
mechanisms (for collision detection, binning, and whatever else).

I believe that the behavior is preserved because the problem which conservative
rasterization is attempting to fix would go away with the "NORMAL" mode (at the
cost of performance, I believe).

This patch serves as documentation of the change by creating the enums, as well
as giving some of the history with the links here so that the next person who
comes along and looks at it doesn't spend as long as I had to in order to
determine if there is an issue or not.

Previously, this algorithm had been done in software, and this can still be used
as long as we don't export an extension stating otherwise.

References: https://www.opengl.org/registry/specs/NV/conservative_raster.txt
References: https://http.developer.nvidia.com/GPUGems2/gpugems2_chapter42.html
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

svga: update call to u_upload_alloc()

u_upload_alloc() no longer returns a return value.

Trivial.

winsys/radeon: remove exported buffers from the cache

Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

winsys/amdgpu: remove exported buffers from the cache

Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

gallium/pb_bufmgr_cache: add a way to remove buffers from the cache explicitly

This must be done before exporting a buffer as dmabuf fds, because
we lose track of who is using it and can't trust the reference counter.

Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

u_upload_mgr: remove the return value from u_upload_data

Reviewed-by: Brian Paul <brianp@vmware.com>

u_upload_mgr: remove the return value from u_upload_buffer

Reviewed-by: Brian Paul <brianp@vmware.com>

u_upload_mgr: remove the return value from u_upload_alloc_buffer

Reviewed-by: Brian Paul <brianp@vmware.com>

u_upload_mgr: remove the return value from u_upload_alloc

The return buffer or the returned pointer can be used instead.

Reviewed-by: Brian Paul <brianp@vmware.com>

u_upload_mgr: optimize u_upload_alloc

This is probably the most called util function. It does almost nothing,
yet it can consume 10% of the CPU on the profile. This drops it down to 5%.

Reviewed-by: Brian Paul <brianp@vmware.com>

gallium/radeon: remove 'dirty' member from r600_atom

It's no longer used by both r600 and radeonsi now.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>

r600g: simplify dirty atom tracking

Now that R600_NUM_ATOMS is below 64, dirty atom tracking can be
simplified.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>

r600g: start numbering atoms from 1

There doesn't seem any reason to start from 4.
Start from 1 instead (0 is left reserved to catch uninitialized atoms).

Signed-off-by: Marek Olšák <marek.olsak@amd.com>

r600g: make all viewport states use single atom

Similarly to scissor states, we can use single atom to track all viewport
states. This will allow to simplify dirty atom handling later.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>

r600g: apply disable workaround on all scissors

During review of the "r600g: make all scissor states use single atom" patch
Marek Olšák noticed that scissor disable workaround should be applied on
all scissor states and not just first one, so let's do so.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>

r600g: make all scissor states use single atom

As suggested by Marek Olšák, we can use single atom to track all scissor
states. This will allow to simplify dirty atom handling later.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>

mesa/pbo: Handle zero width, height or depth when validating access

It's legal to call glTexSubImage with zero values for the width,
height or depth. Previously this was breaking the PBO access
validation because it tries to work out the last pixel accessed by
getting the pixel at height-1 and depth-1 which would end up with
bogus values.

This was causing GL errors to be generated during the Piglit
texsubimage test, although the test was passing anyway.

v2: Also check for width == 0. Don't validate the start pointer if any
of the dimensions are zero.
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>

glsl: Remove unused total_attribs_size variable.

Accidentally left behind by my previous patch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>

glsl: Handle attribute aliasing in attribute storage limit check.

In various versions of OpenGL and GLSL, it's possible to declare
multiple VS input variables with aliasing attribute locations.

So, when computing the storage requirements for vertex attributes,
we can't simply add up the sizes. Instead, we need to look at the
enabled slots.

This patch begins tracking which attributes are double types that
are larger than 128-bits (i.e. take up two vec4 slots). We then
count normal attributes once, and count the double-size attributes
a second time.

Fixes deQP functional.attribute_location.bind_aliasing.max_cond_* tests
on i965, which regressed with commit ad208d975a6d3aebe14f7c2c16039ee20.

No Piglit changes on llvmpipe (which actually supports dvecs).

Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>