Jason Ekstrand [Tue, 6 Sep 2016 22:02:31 +0000 (15:02 -0700)]
nir/spirv: Swap the argument order for AtomicCompareExchange
SPIR-V has the two arguments in the opposite order from GLSL. NIR uses the
GLSL order so we had them backwards.
Fixes dEQP-VK.spirv_assembly.instruction.compute.opatomic.compex
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Tim Rowley [Wed, 17 Aug 2016 15:45:37 +0000 (10:45 -0500)]
vbo: increase VBO_SAVE_BUFFER_SIZE from 8k to 256k dwords
Increases the performance of legacy geometry-heavy apps
still using display lists.
Performance increase for a targeted testcase is on the
order of 8x, and applications like ParaView 4.x (5.x uses
no longer used display lists) improve by about 10%-20%.
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Vinson Lee [Thu, 1 Sep 2016 07:20:02 +0000 (00:20 -0700)]
glsl: Add positional argument specifiers.
Fix build with Python < 2.7.
File "./glsl/ir_expression_operation.py", line 360, in get_enum_name
return "ir_{}op_{}".format(("un", "bin", "tri", "quad")[self.num_operands-1], self.name)
ValueError: zero length field name in format
Fixes: e31c72a331b1 ("glsl: Convert tuple into a class")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Roland Scheidegger [Tue, 6 Sep 2016 17:47:14 +0000 (19:47 +0200)]
util: (trivial) add <stdint.h> include to slab.c
should fix "src/util/slab.c:57:13: error: ‘uint8_t’ undeclared"
Jason Ekstrand [Tue, 6 Sep 2016 15:32:19 +0000 (08:32 -0700)]
glsl: Add .gitignore for make check warnings test
Jason Ekstrand [Tue, 6 Sep 2016 03:09:47 +0000 (20:09 -0700)]
anv/pipeline: Lower indirect outputs when EmitNoIndirectOutput is set
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reported-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Rob Herring [Thu, 1 Sep 2016 19:06:23 +0000 (14:06 -0500)]
Android: glsl: add rules to generate ir_expression*.h header files
Recent changes to generate ir_expression*.h header files broke Android
builds. This adds the generation rules. This change is complicated due to
creating a circular dependency between libmesa_glsl, libmesa_nir, and
libmesa_compiler. Normally, we add static libraries so that include paths
are added even if there's no linking dependency. That is the case here.
Instead, we explicitly add the include path using $(MESA_GEN_GLSL_H) to
libmesa_compiler. This in turn requires shuffling the order of make
includes. It also uncovered missing dependency tracking of glsl_parser.h.
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Leo Liu [Mon, 29 Aug 2016 17:43:48 +0000 (13:43 -0400)]
st/omx/dec: enable hevc omx decode support
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Leo Liu [Mon, 29 Aug 2016 17:42:24 +0000 (13:42 -0400)]
st/omx/dec/h265: get the reference list for uvd
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Leo Liu [Tue, 30 Aug 2016 17:09:53 +0000 (13:09 -0400)]
st/omx/dec/h265: add short term reference picture sets
Specified by subclause 7.3.7
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Leo Liu [Mon, 29 Aug 2016 17:30:08 +0000 (13:30 -0400)]
st/omx/dec/h265: add slice header
Specified by subclause 7.3.6.1
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Leo Liu [Mon, 29 Aug 2016 17:29:14 +0000 (13:29 -0400)]
st/omx/dec/h265: add picture parameter sets
Specified by subclause 7.3.2.3
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Leo Liu [Mon, 29 Aug 2016 17:26:55 +0000 (13:26 -0400)]
st/omx/dec/h265: add sequence parameter sets
Specified by subclause 7.3.2.2
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Leo Liu [Mon, 29 Aug 2016 17:09:12 +0000 (13:09 -0400)]
st/omx/dec: add initial omx hevc support
Mainly based on the h264 implementation.
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Leo Liu [Thu, 11 Aug 2016 19:20:53 +0000 (15:20 -0400)]
st/omx/dec: set dst rect to match src size
When creating interlaced video buffer, hegith set to "template.height =
align(tmpl->height/ array_size, VL_MACROBLOCK_HEIGHT);", and we use
"template.height *= array_size;" for the buffer height, so it actually
aligned with 32. With progressive video buffer it still aligned with 16,
thus causing different height between interlaced buffer and progressive
buffer for 4K (height=2160), and 720p (height=720).
When transcode the video, this will cause the 16 lines corruption
at the bottom of the encode video.
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Marek Olšák [Sun, 28 Aug 2016 09:05:14 +0000 (11:05 +0200)]
gallium: switch drivers to the slab allocator in src/util
Marek Olšák [Sat, 27 Aug 2016 17:52:31 +0000 (19:52 +0200)]
util: import the slab allocator from gallium
There are also some cosmetic changes.
Michel Dänzer [Tue, 6 Sep 2016 02:34:49 +0000 (11:34 +0900)]
loader/dri3: Always use at least two back buffers
This can make a significant difference for performance with some extreme
test cases such as vblank_mode=0 glxgears.
Fixes: 1e3218bc5ba2 ("loader/dri3: Overhaul dri3_update_num_back")
Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97549
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Kenneth Graunke [Sat, 3 Sep 2016 17:51:07 +0000 (10:51 -0700)]
glsl: Fix locations of variables in patch qualified interface blocks.
As of commit
d82f8d9772813949d0f5455cd0edad9003be0fb0, we actually
parse and attempt to handle the 'patch' qualifier on interface blocks.
This patch fixes explicit locations for variables in such blocks.
Without it, many program interface query dEQP/CTS tests hit this
assertion in ir_set_program_inouts.cpp
if (is_patch_generic) {
assert(idx >= VARYING_SLOT_PATCH0 && idx < VARYING_SLOT_TESS_MAX);
bitfield = BITFIELD64_BIT(idx - VARYING_SLOT_PATCH0);
}
because the location was incorrectly based on VARYING_SLOT_VAR0.
Note that most of the tests affected currently fail before they hit
this, due to confusion about what the program interface query name
of those resources should be.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Kenneth Graunke [Sat, 3 Sep 2016 06:18:36 +0000 (23:18 -0700)]
mesa: Fix types in _mesa_get_color_read_format().
This is a mesa_format, not a GLenum.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Dave Airlie [Sun, 4 Sep 2016 23:54:07 +0000 (09:54 +1000)]
amd/addrlib: move addrlib from amdgpu winsys to common code
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Dave Airlie [Sun, 4 Sep 2016 23:52:10 +0000 (09:52 +1000)]
gallium/util: move endian detect into a separate file
This just ports the simpler endian detection bits, addrlib
sharing wants this outside gallium.
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Dave Airlie [Fri, 2 Sep 2016 07:40:39 +0000 (17:40 +1000)]
radeon: move radeon_family/chip_class defintions to common
This just moves these to a common header file.
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Dave Airlie [Fri, 2 Sep 2016 07:09:45 +0000 (17:09 +1000)]
radeonsi: move sid.h/r600d_common.h to a common place.
Step one to merging radv would be to move some files around.
This only adds the include path to r600/radeonsi, because later
we want to avoid having to add it to the generic target paths.
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Fri, 26 Aug 2016 16:08:03 +0000 (18:08 +0200)]
gallium/radeon: remove VPORT_ZMIN/ZMAX from init config states
It's part of the viewport state now.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Fri, 26 Aug 2016 15:26:43 +0000 (17:26 +0200)]
gallium/radeon: set VPORT_ZMIN/MAX registers correctly
Calculate depth ranges from viewport states and
pipe_rasterizer_state::clip_halfz.
The evergreend.h change is required to silence a warning.
This fixes this recently updated piglit: arb_depth_clamp/depth-clamp-range
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Fri, 26 Aug 2016 09:53:47 +0000 (11:53 +0200)]
gallium/radeon: unify viewport emission code
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Tue, 23 Aug 2016 13:26:01 +0000 (15:26 +0200)]
radeonsi: also do VS_PARTIAL_FLUSH before updating VGT ring pointers
ported from Vulkan
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Thu, 25 Aug 2016 12:08:24 +0000 (14:08 +0200)]
radeonsi: fix variable naming in si_emit_cache_flush
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Wed, 24 Aug 2016 13:32:56 +0000 (15:32 +0200)]
radeonsi: don't emit CS_PARTIAL_FLUSH if compute is not used
for less noise in the HUD
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Tue, 23 Aug 2016 13:17:35 +0000 (15:17 +0200)]
radeonsi: add HUD queries for counting VS/PS/CS partial flushes
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Tue, 23 Aug 2016 13:07:35 +0000 (15:07 +0200)]
gallium/radeon: rename the num-cs-flushes query to num-ctx-flushes
num-cs-flushes will mean compute shader flushes
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Tue, 23 Aug 2016 15:58:22 +0000 (17:58 +0200)]
radeonsi: fix a badly implemented GS bug workaround
Limit it to geometry shaders and Hawaii.
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Mon, 22 Aug 2016 11:45:05 +0000 (13:45 +0200)]
radeonsi: fix texture format reinterpretation with DCC
DCC is limited in how texture formats can be reinterpreted using texture
views. If we get a view format that is incompatible with the initial
texture format with respect to DCC, disable DCC.
There is a new piglit which tests all format combinations.
What works and what doesn't was deduced by looking at the piglit failures.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sat, 20 Aug 2016 14:50:01 +0000 (16:50 +0200)]
radeonsi: fix Gather4 with integer formats
The closed compiler does the same thing.
This fixes: GL45-CTS.texture_gather.*-int-* (18 tests)
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sat, 20 Aug 2016 23:32:22 +0000 (01:32 +0200)]
radeonsi: fix a crash in imageSize for cubemap arrays
Sometimes it was f32, other times it was i32. Now it's always i32.
This fixes:
GL45-CTS.texture_cube_map_array.image_texture_size.texture_size_compute_sh
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Fri, 19 Aug 2016 23:42:09 +0000 (01:42 +0200)]
radeonsi: fix gl_PatchVerticesIn for tessellation evaluation shader
This fixes:
GL45-CTS.tessellation_shader.tessellation_control_to_tessellation_evaluation
.gl_PatchVerticesIn
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Fri, 19 Aug 2016 17:52:14 +0000 (19:52 +0200)]
radeonsi: fix cubemaps viewed as 2D
This fixes: GL43-CTS.texture_view.view_sampling
v2: fix a typo, merge both if statements
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Dave Airlie <airlied@redhat.com> (v1)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v1)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sat, 27 Aug 2016 09:48:14 +0000 (11:48 +0200)]
radeonsi: always use the same function signature for llvm.SI.export
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 21 Aug 2016 10:00:01 +0000 (12:00 +0200)]
radeonsi: return correct eviction stats for NVX_gpu_memory_info
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 21 Aug 2016 10:39:21 +0000 (12:39 +0200)]
gallium/radeon: also eliminate DCC fast clear in resource_get_handle
just do what the comment says
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 21 Aug 2016 10:30:21 +0000 (12:30 +0200)]
gallium/radeon: use the current ctx for CMASK elimination in resource_get_handle
For coherency with the current context.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 21 Aug 2016 10:30:21 +0000 (12:30 +0200)]
gallium/radeon: use the current ctx for DCC decompression in resource_get_handle
For coherency with the current context.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Thu, 18 Aug 2016 14:30:00 +0000 (16:30 +0200)]
gallium/radeon: derive buffer placement and flags only at initialization
Invalidated buffers don't have to go through it.
Split r600_init_resource into r600_init_resource_fields and
r600_alloc_resource.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 21 Aug 2016 14:13:16 +0000 (16:13 +0200)]
radeonsi: set more sampler settings
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Emil Velikov [Mon, 5 Sep 2016 15:13:48 +0000 (16:13 +0100)]
docs: add news item and link release notes for 12.0.2
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Emil Velikov [Mon, 5 Sep 2016 15:03:06 +0000 (16:03 +0100)]
docs: add sha256 checksums for 12.0.2
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit
614fb93a6d0246d5592333a1b914ce71a409fcf7)
Emil Velikov [Mon, 5 Sep 2016 11:14:11 +0000 (12:14 +0100)]
docs: add release notes for 12.0.2
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit
2fc6a31f10e908af8f348aba796d0e6b1616b863)
Marek Olšák [Sun, 28 Aug 2016 16:50:19 +0000 (18:50 +0200)]
noop: implement resource_get_handle
X+DRI3 locks up if the returned handle is invalid.
Marek Olšák [Sun, 28 Aug 2016 11:58:16 +0000 (13:58 +0200)]
noop: set missing functions
Marek Olšák [Sun, 28 Aug 2016 11:57:44 +0000 (13:57 +0200)]
noop: simplify some functions
Emil Velikov [Thu, 1 Sep 2016 09:36:44 +0000 (10:36 +0100)]
glx/glvnd: list the strcmp arguments in correct order
Currently, due to the inverse order, strcmp will produce negative result
when the needle is towards the start of the haystack. Thus on the next
iteration(s) we'll end up further towards the end and eventually fail to
locate the entry.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Jason Ekstrand [Sat, 3 Sep 2016 18:57:05 +0000 (11:57 -0700)]
nir/tests: Update the CF tests to not assume fake edges
In
aad4f1550, we removed the concept of "fake" edges from NIR. Now, if you
have a block at the end of an infinite loop it really has no predecessors.
This updates the unit tests to match.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97587
Tested-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Ilia Mirkin [Sun, 4 Sep 2016 22:21:29 +0000 (18:21 -0400)]
gk110/ir: fix quadop dall emission
We recently starting to always emit the NDV (== dall) bit for quadops.
However it was folded into the wrong code word.
Fixes: e0a067ed48 (nv50/ir: always emit the NDV bit for OP_QUADOP)
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
Mauro Rossi [Sun, 4 Sep 2016 00:00:24 +0000 (02:00 +0200)]
android: intel: fix include paths in new "common" library
Fixes building error in libmesa_intel_common static library
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Ilia Mirkin [Wed, 31 Aug 2016 02:42:24 +0000 (22:42 -0400)]
a3xx: use window scissor to simulate viewport xy clip
Unfortunately a3xx does not have a separate disable for depth clipping,
so when depth clamp is enabled, we disable the whole 3d clipper logic.
This in turn also gets rid of the xy clip that it would normally do.
When we detect this would happen, instead we integrate the viewport into
the window scissor. This may have slightly different behavior around
wide points, but it's unlikely that anything depends on this.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97231
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
Ilia Mirkin [Sat, 20 Aug 2016 04:14:43 +0000 (00:14 -0400)]
a3xx: make use of software clipping when hw can't handle it
The hw clipper only handles up to 6 UCPs. If there are more than 6 UCPs,
or a clip vertex, or clip distances are in use, then we must use the
fallback discard-based clipping from the frag shader.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
Ilia Mirkin [Mon, 15 Aug 2016 03:58:18 +0000 (23:58 -0400)]
a3xx: make sure to actually clamp depth as requested
We were previously ... not clamping. I guess this meant that everything
got clamped to 1/0, which was enough to pass the existing tests. Or
perhaps the clamping would only happen to the rasterized depth value and
not the frag shader's output depth value.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97231
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
Karol Herbst [Sat, 13 Aug 2016 09:54:52 +0000 (11:54 +0200)]
nvc0/ir: allow min/max instructions to be dual-issued in pairs
changes for GpuTest /test=pixmark_piano /benchmark /no_scorebox /msaa=0
/benchmark_duration_ms=60000 /width=1024 /height=640:
inst_executed: 1.03G
inst_issued1: 614M -> 580M
inst_issued2: 213M -> 230M
score: 1021 -> 1030
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Jason Ekstrand [Tue, 23 Aug 2016 00:13:51 +0000 (17:13 -0700)]
anv: Move cmd_buffer_config_l3 into anv_cmd_buffer.c
This is the only remaining part of genX_l3.c and there's really no good
reason for it to be in its own file.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Jason Ekstrand [Tue, 23 Aug 2016 00:13:27 +0000 (17:13 -0700)]
anv/cmd_buffer: Move emit_lri and emit_lrm higher up
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Jason Ekstrand [Mon, 22 Aug 2016 23:56:48 +0000 (16:56 -0700)]
anv: Refactor pipeline l3 config setup
Now that we're using gen_l3_config.c, we no longer have one set of l3
config functions per gen and we can simplify a bit. Also, we know that
only compute uses SLM so we don't need to look for it in all of the stages.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Jason Ekstrand [Mon, 22 Aug 2016 23:39:05 +0000 (16:39 -0700)]
anv: Leverage the shared L3$ config code
When Jordan first implement L3$ configuration for Vulkan, he copied+pasted
from the GL driver because we had no good place to share it. Now that we
have src/intel/common, we should be sharing these tables.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Jason Ekstrand [Mon, 22 Aug 2016 23:09:32 +0000 (16:09 -0700)]
intel: Pull the guts of gen7_l3_state.c into a shared helper
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Jason Ekstrand [Thu, 25 Aug 2016 23:22:58 +0000 (16:22 -0700)]
intel: Rename brw_get_device_name/info to gen_get_device_name/info
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Jason Ekstrand [Mon, 22 Aug 2016 22:01:08 +0000 (15:01 -0700)]
intel: s/brw_device_info/gen_device_info/
Generated by:
sed -i -e 's/brw_device_info/gen_device_info/g' src/intel/**/*.c
sed -i -e 's/brw_device_info/gen_device_info/g' src/intel/**/*.h
sed -i -e 's/brw_device_info/gen_device_info/g' **/i965/*.c
sed -i -e 's/brw_device_info/gen_device_info/g' **/i965/*.cpp
sed -i -e 's/brw_device_info/gen_device_info/g' **/i965/*.h
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Jason Ekstrand [Mon, 22 Aug 2016 21:47:55 +0000 (14:47 -0700)]
intel: Add a new "common" library for more code sharing
The first thing to go in this new library is brw_device_info.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Mauro Rossi [Sat, 3 Sep 2016 09:00:25 +0000 (11:00 +0200)]
intel/blorp: fix typo in android makefile
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Timothy Arceri [Sat, 3 Sep 2016 10:30:19 +0000 (20:30 +1000)]
nir: remove unused variable
This was let over from
aad4f15506c
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Connor Abbott [Sat, 3 Sep 2016 04:49:58 +0000 (00:49 -0400)]
nir: remove some fields from nir_shader_compiler_options
I accidentally added these with
0dc4cab. Oops!
Connor Abbott [Fri, 2 Sep 2016 23:07:57 +0000 (19:07 -0400)]
nir: fix bug with moves in nir_opt_remove_phis()
In
144cbf8 ("nir: Make nir_opt_remove_phis see through moves."), Ken
made nir_opt_remove_phis able to coalesce phi nodes whose sources are
all moves with the same swizzle. However, he didn't add the logic
necessary for handling the fact that the phi may now have multiple
different sources, even though the sources point to the same thing. For
example, if we had something like:
if (...)
a1 = b.yx;
else
a2 = b.yx;
a = phi(a1, a2)
... = a
then we would rewrite it to
if (...)
a1 = b.yx;
else
a2 = b.yx;
... = a1
by picking a random phi source, which in this case is invalid because
the source doesn't dominate the phi. Instead, we need to change it to:
if (...)
a1 = b.yx;
else
a2 = b.yx;
a3 = b.yx;
... = a3;
Fixes 12 CTS tests:
ES31-CTS.functional.tessellation.invariance.outer_edge_symmetry.quads*
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Connor Abbott [Fri, 2 Sep 2016 23:06:52 +0000 (19:06 -0400)]
nir: add nir_after_phis() cursor helper
And re-implement nir_after_cf_node_and_phis() using it.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Ilia Mirkin [Sun, 28 Aug 2016 19:04:00 +0000 (15:04 -0400)]
glsl: expose max atomic counter/buffer consts for tess in ES 3.2
Curiously OES/EXT_tessellation_shader leave these out, while ES 3.2 adds
them in.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Ilia Mirkin [Sun, 28 Aug 2016 18:52:10 +0000 (14:52 -0400)]
mapi: don't forget to expose GetPointerv in GL ES 3.2
I left this out of my previous commit that went around enabling all of
the other ES 3.2 entrypoints.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Ilia Mirkin [Mon, 29 Aug 2016 00:39:07 +0000 (20:39 -0400)]
main: add KHR_robustness to ES 3.2 extension requirements
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Ilia Mirkin [Sat, 3 Sep 2016 03:57:06 +0000 (23:57 -0400)]
nv50,nvc0: respect render condition enable flag when clearing rt/zs
This is a newly added flag. We always pass false into it from
nv50_clear_texture, but other callers may want to respect the render
condition. (And the functions were originally spec'd to respect it.)
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Karol Herbst [Sat, 13 Aug 2016 09:54:45 +0000 (11:54 +0200)]
nvc0/ir: don't dual-issue ops that depend or interfere with each other
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
[imirkin: rewrite to split up the helpers and move more logic to target]
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Jason Ekstrand [Wed, 31 Aug 2016 21:45:08 +0000 (14:45 -0700)]
nir: Remove fake edges in the CF handling code
When NIR was first introduced, Connor added this fake-edge hack to work
around issues related to unreachable blocks. Thanks to GLSL IR's jump
lowering code, the only unreachable code you can have is a block after an
infinite loop. With SPIR-V, we didn't have the jump lowering code so we
could also end up with the "if (...) { break; } else { continue; }" case
which generates an unreachable block after the if. Because of this, most
of NIR had to be fixed up for handling unreachable blocks. The only
remaining case of not handling unreachable blocks was specifically the
block-after-infinite-loop case in dead_cf which was fixed by the previous
commit. We can now delete the fake edge hack.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Jason Ekstrand [Wed, 31 Aug 2016 23:35:21 +0000 (16:35 -0700)]
nir/dead_cf: Don't crash on unreachable after-loop blocks
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Samuel Pitoiset [Wed, 31 Aug 2016 20:52:47 +0000 (22:52 +0200)]
nvc0: reduce the initial code segment size to 512KB
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Samuel Pitoiset [Wed, 31 Aug 2016 20:52:46 +0000 (22:52 +0200)]
nvc0: allow to resize the code segment dynamically
When an application uses a ton of shaders, we need to evict them
when the code segment is full but this is not really a good solution
if monster shaders are used because code eviction will happen a lot.
To avoid this, it seems better to dynamically resize the code
segment area after each eviction. The maximum size is arbitrary
fixed to 8MB which should be enough.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Samuel Pitoiset [Wed, 31 Aug 2016 20:52:45 +0000 (22:52 +0200)]
nvc0: add a new bin for the code segment
To avoid the bins list to grow up indefinitely when the code segment
size will be bumped, we need to separate that bin from the SCREEN
one because it contains other resources like the uniform bo.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Samuel Pitoiset [Wed, 31 Aug 2016 20:52:44 +0000 (22:52 +0200)]
nvc0: add nvc0_screen_resize_text_area() helper
This function will be helpful for resizing the code segment
area when we need to evict all shaders.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Samuel Pitoiset [Wed, 31 Aug 2016 20:52:43 +0000 (22:52 +0200)]
nvc0: re-upload currently bound shaders after code eviction
This fixes a very old issue which happens when the code segment
size is full. A bunch of real applications like Tomb Raider,
F1 2015, Elemental, hit that issue because they use a ton of shaders.
In this case, all shaders are evicted (for freeing space) but all
currently bound shaders also need to be re-uploaded and SP_START_ID
have to be updated accordingly.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Samuel Pitoiset [Wed, 31 Aug 2016 20:52:42 +0000 (22:52 +0200)]
nvc0: refactor the program upload process
This refactoring will help for fixing the "out of code space"
eviction issue because we will need to reupload the code for
all currently bound shaders but it's slightly different than
uploading a new fresh code.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Jordan Justen [Fri, 18 Oct 2013 22:03:09 +0000 (15:03 -0700)]
i965: fix noop_scissor range issue on width/height
If scissor X or Y was set to a negative value then the previous
code might have indicated noop scissors when the scissor range
actually was masking a portion of the framebuffer.
Since fb->_Xmin, _Xmax, _Ymin and _Ymax take scissors into
account, we can use these to test for a noop scissor.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Kenneth Graunke [Wed, 31 Aug 2016 05:51:25 +0000 (22:51 -0700)]
glsl: Only force varyings to be flat when varying packing.
Varying packing would like to mark certain variables as flat.
This works as long as both sides of the interfaces are changed
accordingly. However, with SSO, we disable varying packing on
the outermost stages. We also disable varying packing for
certain tessellation stages.
With SSO, we operate on the producer and consumer separately.
Checks based on the consumer stage and variable are risky, and
can easily lead to altering one half of the interface between
stages, breaking SSO pipeline IO validation.
Just stop monkeying around with interpolation modes unless
required for varying packing. There's no point. This also
disables it in unsafe SSO cases.
Fixes CTS tests:
*.tessellation_shader.tessellation_control_to_tessellation_evaluation.gl_MaxPatchVertices_Position_PointSize
Also fixes Piglit's spec/oes_geometry_shader/sso_validation:
- user-defined-gs-input-not-in-block.shader_test
- user-defined-gs-input-in-block.shader_test
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Kenneth Graunke [Wed, 31 Aug 2016 07:16:24 +0000 (00:16 -0700)]
glsl: Reject TCS/TES input arrays not sized to gl_MaxPatchVertices.
We handled the unsized case, implicitly sizing arrays to the value
of gl_MaxPatchVertices. But if a size was present, we failed to
raise a compile error if it wasn't the value of gl_MaxPatchVertices.
Fixes CTS tests:
*.tessellation_shader.compilation_and_linking_errors.
{tc,te}_invalid_array_size_used_for_input_blocks
Piglit's tcs-input-read-nonconst-* tests have recently been fixed.
This patch will break older copies of those tests, but the latest
should continue working. Update to Piglit
75819c13af2ed5.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Frank Binns [Thu, 4 Aug 2016 14:15:40 +0000 (15:15 +0100)]
wayland-drm: add missing NULL check
Although malloc is unlikely to fail check its return value nevertheless.
Signed-off-by: Frank Binns <frank.binns@imgtec.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Frank Binns [Thu, 4 Aug 2016 14:15:23 +0000 (15:15 +0100)]
loader: fix sysfs uevent file parsing
When trying to get a device name for an fd using sysfs, it would always fail
as it was expecting key/value pairs to be delimited by '\0', which is not the
case.
Signed-off-by: Frank Binns <frank.binns@imgtec.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Frank Binns [Fri, 17 Jun 2016 17:41:22 +0000 (18:41 +0100)]
egl: only store device name when Wayland support is built
The device name is only needed for WL_bind_wayland_display so make this clear
by only storing the device name when Wayland support is built.
Signed-off-by: Frank Binns <frank.binns@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Lionel Landwerlin [Fri, 19 Aug 2016 23:38:05 +0000 (00:38 +0100)]
isl: round format alignment to nearest power of 2
A few inline asserts in anv assume alignments are power of 2, but with
formats like R8G8B8 we have odd alignments.
v2: round up to power of 2 (Ilia)
v3: reuse util_next_power_of_two() from gallium/aux/util/u_math.h (Ilia)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Thomas Hellstrom [Thu, 26 May 2016 09:16:28 +0000 (11:16 +0200)]
gallium/postprocess: Fix resource freeing
The code was triggering asserts in DEBUG builds of the SVGA driver since
the reference count of the resource was never decremented before destroy.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Ilia Mirkin [Sat, 27 Aug 2016 21:47:37 +0000 (17:47 -0400)]
st/mesa: expose OES_geometry_shader and OES_texture_cube_map_array
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Eric Engestrom [Tue, 30 Aug 2016 20:02:18 +0000 (21:02 +0100)]
Introduce .editorconfig
A few weeks ago, Jose Fonseca suggested [0] we use .editorconfig files
to try and enforce the formatting of the code, to which Michel Dänzer
suggested [1] we start by importing the existing .dir-locals.el
settings. The first draft was discussed in the RFC [2].
These .editorconfig are a first step, one that has the advantage of
requiring little to no intervention from the devs once the settings
files are in place, but the settings are very limited. This does have
the advantage of applying while the code is being written.
This doesn't replace the need for more comprehensive formatting tools
such as clang-format & clang-tidy, but those reformat the code after
the fact.
[0] https://lists.freedesktop.org/archives/mesa-dev/2016-June/121545.html
[1] https://lists.freedesktop.org/archives/mesa-dev/2016-June/121639.html
[2] https://lists.freedesktop.org/archives/mesa-dev/2016-July/123431.html
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Acked-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Eric Anholt [Mon, 29 Aug 2016 18:23:35 +0000 (11:23 -0700)]
vc4: Add missing break statement.
This opcode isn't used yet, so it didn't affect anything. Caught by
Coverity, reported to me by imirkin.
Brian Paul [Wed, 31 Aug 2016 16:03:53 +0000 (10:03 -0600)]
gallium/docs: clarify render_condition_enabled parameter to clear functions
If false, it means do the clear unconditionally.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Jason Ekstrand [Wed, 31 Aug 2016 20:45:27 +0000 (13:45 -0700)]
mesa: Add some more .gitignore
Matt Turner [Mon, 29 Aug 2016 22:57:41 +0000 (15:57 -0700)]
i965: Pass start_offset to brw_set_uip_jip().
Without this, we would pass over the instructions in the SIMD8 program
(which is located earlier in the buffer) when brw_set_uip_jip() is
called to handle the SIMD16 program.
The assertion about compacted control flow was bogus: halt, cont, break
cannot be compacted because they have both JIP and UIP. Instead, we
should never see a compacted instruction in this code at all.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Kenneth Graunke [Tue, 30 Aug 2016 20:16:02 +0000 (13:16 -0700)]
i965: Merge gen7_clip_state atom into gen6_clip_state atom.
The original motivation was that gen6_clip_state ignored _NEW_POLYGON
as it didn't care about early culling. The only other change was that
Gen6 ignored BRW_NEW_TES_PROG_DATA as it doesn't have tessellation
shaders, but listening to this is harmless as it'll never be signalled.
Now that we've added _NEW_POLYGON for is_drawing_lines/points, we can
merge the two as the distinction is meaningless.
This actually fixes a bug, though: Gen8+ was using the gen6_clip_state
atom because it doesn't care about early culling, but it also needs
BRW_NEW_TES_PROG_DATA, which was missing.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>