Jason Ekstrand [Fri, 12 Jul 2019 20:26:03 +0000 (15:26 -0500)]
nir/lower_system_values: Support lowering more intrinsics
Instead of only lowering system values from variables, lower most to intrinsics
and let the lowering framework immediately lower the intrinsic. This
will result in a bit more instruction churn but it means that NIR code
builders can just use intrinsics instead of everything having to go
through variables.
Reviewed-by: Eric Anholt <eric@anholt.net>
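The "lower to an intrinsic, then let the framework lower the intrinsic" idea described above can be sketched with a toy model. This is not the NIR API: the toy_* names, the three-opcode instruction set, and the callback signatures are all assumptions made up for illustration. It only shows why a filter/lower callback pair that is re-applied until nothing matches costs some extra rewrites (the "instruction churn") while letting either starting form reach the same final result.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy opcodes (hypothetical): a system value read through a variable,
 * the same value as an intrinsic, and the final hardware-level load. */
enum toy_op {
   TOY_OP_SYSVAL_VAR,
   TOY_OP_SYSVAL_INTRINSIC,
   TOY_OP_HW_LOAD,
};

typedef bool (*toy_filter_cb)(enum toy_op op);
typedef enum toy_op (*toy_lower_cb)(enum toy_op op);

/* Generic driver: keep lowering each instruction for as long as the
 * filter still matches it. Returns the total number of rewrites. */
static unsigned
toy_lower_instructions(enum toy_op *instrs, size_t count,
                       toy_filter_cb filter, toy_lower_cb lower)
{
   unsigned rewrites = 0;

   for (size_t i = 0; i < count; i++) {
      while (filter(instrs[i])) {
         instrs[i] = lower(instrs[i]);
         rewrites++;
      }
   }
   return rewrites;
}

/* Everything except the final hardware load still needs lowering. */
static bool
toy_needs_lowering(enum toy_op op)
{
   return op != TOY_OP_HW_LOAD;
}

/* One lowering step: variable -> intrinsic -> hardware load. */
static enum toy_op
toy_lower_one(enum toy_op op)
{
   switch (op) {
   case TOY_OP_SYSVAL_VAR:       return TOY_OP_SYSVAL_INTRINSIC;
   case TOY_OP_SYSVAL_INTRINSIC: return TOY_OP_HW_LOAD;
   default:                      return op;
   }
}
```

In this model a variable-based read takes two rewrites and an intrinsic takes one, but code that emits the intrinsic directly no longer has to go through a variable first.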
Jason Ekstrand [Thu, 11 Jul 2019 19:01:20 +0000 (14:01 -0500)]
nir/lower_system_values: Drop the context-aware builder functions
Instead of having context-aware builder functions, just provide lowering
for the system value intrinsics and let nir_shader_lower_instructions
handle the recursion for us. This makes everything a bit simpler and
means that the lowering can also be used if something comes in as a
system value intrinsic rather than a load_deref.
Reviewed-by: Eric Anholt <eric@anholt.net>
Jason Ekstrand [Thu, 11 Jul 2019 18:30:03 +0000 (13:30 -0500)]
nir/lower_system_values: Use the new generic NIR lowering helpers
Reviewed-by: Eric Anholt <eric@anholt.net>
Jason Ekstrand [Thu, 11 Jul 2019 18:04:05 +0000 (13:04 -0500)]
nir/lower_subgroups: Use the new generic NIR lowering helpers
Reviewed-by: Eric Anholt <eric@anholt.net>
Jason Ekstrand [Thu, 11 Jul 2019 18:00:42 +0000 (13:00 -0500)]
nir: Add some generic helpers for writing lowering passes
Reviewed-by: Eric Anholt <eric@anholt.net>
Jason Ekstrand [Thu, 11 Jul 2019 20:05:27 +0000 (15:05 -0500)]
nir: Add a helper for fetching the SSA def from an instruction
Reviewed-by: Eric Anholt <eric@anholt.net>
Tomeu Vizoso [Fri, 12 Jul 2019 14:42:52 +0000 (16:42 +0200)]
pandecode: Add more addresses to trace
When debugging, we're given the fault_pointer unresolved, so it is
helpful to have more context in the decode.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Tomeu Vizoso [Thu, 11 Jul 2019 06:06:41 +0000 (08:06 +0200)]
panfrost: Use 64-bit descriptors globally
Midgard supports two modes of operation, 32-bit mode and 64-bit mode.
The GPU is natively 64-bit, but job descriptors can be submitted in
32-bit mode. Among other changes, 32-bit mode shortens pointer sizes to
use 32-bit pointers rather than the full 64-bit range.
The blob decides which mode to use based on the CPU bitness, so an armhf
system uses 32-bit descriptors and an aarch64 system uses 64-bit
descriptors. For a while, we mimicked this, but inevitably this caused
the 32-bit support to lag behind as our reference platform is 64-bit.
To combat the code staleness, we traced an older GPU paired with a 64-bit
CPU (the Midgard T720 on-board the sunxi H64). From there, we could tell
which fields were really about hardware and which fields were simply
reflections of the descriptor bitness.
From there, we decided to remove support for 32-bit descriptors
entirely, using 64-bit descriptors unconditionally. There is minimal
performance penalty for this in practice, and it allows us to unify
these disparate code paths. This fixes:
- T860 + armhf
- T820 + armhf
- T760 + aarch64
And will help bringup of 1st/2nd generation Midgard regardless of CPU.
[Work done by Tomeu. Commit message written by Alyssa.]
v2: Add comments preserving information about the old behaviour for
future reference. Fix a compiler warning. (Alyssa)
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Jason Ekstrand [Mon, 15 Jul 2019 22:14:26 +0000 (17:14 -0500)]
anv: Account for dynamic stencil write disables in the PMA fix
In 6ce8592836b8 we started looking at the dynamic stencil state and
disabling stencil writes when the stencil mask is zero. Unfortunately,
we never updated the PMA fix code accordingly so 3DSTATE_WM_DEPTH_STENCIL
and the PMA fix were getting out-of-sync causing hangs.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109203
Fixes: 6ce8592836 "anv: Disable stencil writes when both write..."
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Alyssa Rosenzweig [Mon, 15 Jul 2019 21:15:24 +0000 (14:15 -0700)]
panfrost: Implement opportunistic AFBC
Rather than hardcoding a BO layout at creation-time, we implement the
ability to hint layouts at various points in a BO's lifetime,
potentially reallocating and switching layouts if it's heuristically
deemed useful to do so.
In this patch, we add a simple hinting implementation, opportunistically
compressing FBOs.
Support is hidden behind PAN_MESA_DEBUG=afbc as the implementation is
incomplete (software access to AFBC is unimplemented at the moment) and
therefore would regress significantly.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Mon, 15 Jul 2019 22:35:30 +0000 (15:35 -0700)]
panfrost/mfbd: Zero out framebuffer_stride
We don't know what this is, so let's not pretend we do.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Mon, 15 Jul 2019 22:34:50 +0000 (15:34 -0700)]
panfrost: AFBC buffers must be cache-line aligned
Fixes a DATA_INVALID_FAULT when AFBC is paired with mipmapping.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Mon, 15 Jul 2019 22:16:08 +0000 (15:16 -0700)]
panfrost: Add Z/S and MRT BOs to the job
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Mon, 15 Jul 2019 21:59:03 +0000 (14:59 -0700)]
panfrost: Set usage2 during draw, not CSO
It can change from a layout switch.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Sergii Romantsov [Mon, 27 May 2019 13:45:35 +0000 (16:45 +0300)]
meta: memory leak of CopyPixels usage
Meta of CopyPixel generates a buffer object
but does not free it on cleanup.
Fixes: 37d11b13ce1d (meta: Don't pollute the buffer object namespace in _mesa_meta_setup_vertex_objects)
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Samuel Pitoiset [Fri, 12 Jul 2019 16:12:36 +0000 (18:12 +0200)]
radv: add radv_emit_streamout_{begin,end} helpers
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Samuel Pitoiset [Fri, 12 Jul 2019 16:12:35 +0000 (18:12 +0200)]
radv: pass output values to radv_emit_stream_output()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Samuel Pitoiset [Fri, 12 Jul 2019 16:12:34 +0000 (18:12 +0200)]
radv: allow to select DST_SEL with RELEASE_MEM
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Samuel Pitoiset [Fri, 12 Jul 2019 16:12:33 +0000 (18:12 +0200)]
radv: allow to emit PS_DONE/CS_DONE with RELEASE_MEM
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Samuel Pitoiset [Fri, 12 Jul 2019 16:12:32 +0000 (18:12 +0200)]
radv: restore an assertion in handle_vs_outputs()
The NGG GS epilogue no longer calls that function so the assertion
is just useless now.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Samuel Pitoiset [Fri, 12 Jul 2019 16:12:31 +0000 (18:12 +0200)]
radv/gfx10: emit ES outputs of TES when it's not NGG
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Samuel Pitoiset [Tue, 16 Jul 2019 07:34:40 +0000 (09:34 +0200)]
radv: update LATE_ALLOC_VS.LIMIT
Mirror RadeonSI.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Samuel Pitoiset [Tue, 16 Jul 2019 07:34:39 +0000 (09:34 +0200)]
radv/gfx10: support pixel shaders without exports
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Samuel Pitoiset [Tue, 16 Jul 2019 06:37:32 +0000 (08:37 +0200)]
radv: fix gathering clip/cull distance masks for GS
For NGG, the driver relies on the VS outinfo struct.
This fixes
dEQP-VK.clipping.user_defined.clip_*_vert_tess_geom_*
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Samuel Pitoiset [Tue, 16 Jul 2019 07:37:56 +0000 (09:37 +0200)]
Revert "radv/gfx10: don't set array pitch field on images"
It introduces too many regressions.
This reverts commit
6d50dcd80fc120fdabcd57ef576f3e45ea2724e4.
Iago Toral Quiroga [Fri, 12 Jul 2019 09:06:22 +0000 (11:06 +0200)]
v3d: flag dirty state when binding new sampler states
We emit code to saturate texture coordinates when using clamp wrapping
mode so if we don't flag the dirty state here we don't get to recompile
the shaders when the wrapping mode changes.
v2:
- Do the same when setting sampler views (Eric)
- Use a switch statement instead of an if ladder.
- Swap the shader stage assertion with an unreachable.
Fixes:
spec/!opengl 1.1/texwrap 1d bordercolor/gl_rgba8, border color only
spec/!opengl 1.1/texwrap 1d proj bordercolor/gl_rgba8, projected, border color only
spec/!opengl 1.1/texwrap 2d bordercolor/gl_rgba8, border color only
spec/!opengl 1.1/texwrap 2d proj bordercolor/gl_rgba8, projected, border color only
spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_alpha12, swizzled, border color only
spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_alpha16, swizzled, border color only
spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_alpha4, swizzled, border color only
spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_alpha8, swizzled, border color only
spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_intensity8, swizzled, border color only
spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_luminance4_alpha4, swizzled, border color only
spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_luminance6_alpha2, swizzled, border color only
spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_luminance8, swizzled, border color only
spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_luminance8_alpha8, swizzled, border color only
spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_r3_g3_b2, swizzled, border color only
spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_rgb10, swizzled, border color only
spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_rgb10_a2, swizzled, border color only
spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_rgb4, swizzled, border color only
spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_rgb5, swizzled, border color only
spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_rgb5_a1, swizzled, border color only
spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_rgb8, swizzled, border color only
spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_rgba4, swizzled, border color only
spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_rgba8, swizzled, border color only
spec/!opengl 1.1/texwrap formats bordercolor/gl_alpha12, border color only
spec/!opengl 1.1/texwrap formats bordercolor/gl_alpha16, border color only
spec/!opengl 1.1/texwrap formats bordercolor/gl_alpha4, border color only
spec/!opengl 1.1/texwrap formats bordercolor/gl_alpha8, border color only
spec/!opengl 1.1/texwrap formats bordercolor/gl_intensity8, border color only
spec/!opengl 1.1/texwrap formats bordercolor/gl_luminance4_alpha4, border color only
spec/!opengl 1.1/texwrap formats bordercolor/gl_luminance6_alpha2, border color only
spec/!opengl 1.1/texwrap formats bordercolor/gl_luminance8, border color only
spec/!opengl 1.1/texwrap formats bordercolor/gl_luminance8_alpha8, border color only
spec/!opengl 1.1/texwrap formats bordercolor/gl_r3_g3_b2, border color only
spec/!opengl 1.1/texwrap formats bordercolor/gl_rgb10, border color only
spec/!opengl 1.1/texwrap formats bordercolor/gl_rgb10_a2, border color only
spec/!opengl 1.1/texwrap formats bordercolor/gl_rgb4, border color only
spec/!opengl 1.1/texwrap formats bordercolor/gl_rgb5, border color only
spec/!opengl 1.1/texwrap formats bordercolor/gl_rgb5_a1, border color only
spec/!opengl 1.1/texwrap formats bordercolor/gl_rgb8, border color only
spec/!opengl 1.1/texwrap formats bordercolor/gl_rgba4, border color only
spec/!opengl 1.1/texwrap formats bordercolor/gl_rgba8, border color only
spec/!opengl 1.2/texwrap 3d bordercolor/gl_rgba8, border color only
spec/!opengl 1.2/texwrap 3d proj bordercolor/gl_rgba8, projected, border color only
spec/arb_es2_compatibility/texwrap formats bordercolor-swizzled/gl_rgb565, swizzled, border color only
spec/arb_es2_compatibility/texwrap formats bordercolor/gl_rgb565, border color only
spec/arb_texture_compression/texwrap formats bordercolor-swizzled/gl_compressed_alpha, swizzled, border color only
spec/arb_texture_compression/texwrap formats bordercolor-swizzled/gl_compressed_luminance_alpha, swizzled, border color only
spec/arb_texture_compression/texwrap formats bordercolor-swizzled/gl_compressed_rgb, swizzled, border color only
spec/arb_texture_compression/texwrap formats bordercolor/gl_compressed_alpha, border color only
spec/arb_texture_compression/texwrap formats bordercolor/gl_compressed_luminance_alpha, border color only
spec/arb_texture_compression/texwrap formats bordercolor/gl_compressed_rgb, border color only
spec/arb_texture_float/texwrap formats bordercolor-swizzled/gl_alpha16f_arb, swizzled, border color only
spec/arb_texture_float/texwrap formats bordercolor-swizzled/gl_intensity16f_arb, swizzled, border color only
spec/arb_texture_float/texwrap formats bordercolor-swizzled/gl_luminance16f_arb, swizzled, border color only
spec/arb_texture_float/texwrap formats bordercolor-swizzled/gl_luminance_alpha16f_arb, swizzled, border color only
spec/arb_texture_float/texwrap formats bordercolor-swizzled/gl_rgb16f, swizzled, border color only
spec/arb_texture_float/texwrap formats bordercolor-swizzled/gl_rgba16f, swizzled, border color only
spec/arb_texture_float/texwrap formats bordercolor/gl_alpha16f_arb, border color only
spec/arb_texture_float/texwrap formats bordercolor/gl_intensity16f_arb, border color only
spec/arb_texture_float/texwrap formats bordercolor/gl_luminance16f_arb, border color only
spec/arb_texture_float/texwrap formats bordercolor/gl_luminance_alpha16f_arb, border color only
spec/arb_texture_float/texwrap formats bordercolor/gl_rgb16f, border color only
spec/arb_texture_float/texwrap formats bordercolor/gl_rgba16f, border color only
spec/arb_texture_rectangle/texwrap rect bordercolor/gl_rgba8, border color only
spec/arb_texture_rectangle/texwrap rect proj bordercolor/gl_rgba8, projected, border color only
spec/arb_texture_rg/texwrap formats bordercolor-swizzled/gl_r8, swizzled, border color only
spec/arb_texture_rg/texwrap formats bordercolor-swizzled/gl_rg8, swizzled, border color only
spec/arb_texture_rg/texwrap formats bordercolor/gl_r8, border color only
spec/arb_texture_rg/texwrap formats bordercolor/gl_rg8, border color only
spec/arb_texture_rg/texwrap formats-float bordercolor-swizzled/gl_r16f, swizzled, border color only
spec/arb_texture_rg/texwrap formats-float bordercolor-swizzled/gl_rg16f, swizzled, border color only
spec/arb_texture_rg/texwrap formats-float bordercolor/gl_r16f, border color only
spec/arb_texture_rg/texwrap formats-float bordercolor/gl_rg16f, border color only
spec/ext_packed_float/texwrap formats bordercolor-swizzled/gl_r11f_g11f_b10f, swizzled, border color only
spec/ext_packed_float/texwrap formats bordercolor/gl_r11f_g11f_b10f, border color only
spec/ext_texture_shared_exponent/texwrap formats bordercolor-swizzled/gl_rgb9_e5, swizzled, border color only
spec/ext_texture_shared_exponent/texwrap formats bordercolor/gl_rgb9_e5, border color only
spec/ext_texture_snorm/texwrap formats bordercolor-swizzled/gl_alpha8_snorm, swizzled, border color only
spec/ext_texture_snorm/texwrap formats bordercolor-swizzled/gl_intensity8_snorm, swizzled, border color only
spec/ext_texture_snorm/texwrap formats bordercolor-swizzled/gl_luminance8_alpha8_snorm, swizzled, border color only
spec/ext_texture_snorm/texwrap formats bordercolor-swizzled/gl_luminance8_snorm, swizzled, border color only
spec/ext_texture_snorm/texwrap formats bordercolor-swizzled/gl_r8_snorm, swizzled, border color only
spec/ext_texture_snorm/texwrap formats bordercolor-swizzled/gl_rg8_snorm, swizzled, border color only
spec/ext_texture_snorm/texwrap formats bordercolor-swizzled/gl_rgb8_snorm, swizzled, border color only
spec/ext_texture_snorm/texwrap formats bordercolor-swizzled/gl_rgba8_snorm, swizzled, border color only
spec/ext_texture_snorm/texwrap formats bordercolor/gl_alpha8_snorm, border color only
spec/ext_texture_snorm/texwrap formats bordercolor/gl_intensity8_snorm, border color only
spec/ext_texture_snorm/texwrap formats bordercolor/gl_luminance8_alpha8_snorm, border color only
spec/ext_texture_snorm/texwrap formats bordercolor/gl_luminance8_snorm, border color only
spec/ext_texture_snorm/texwrap formats bordercolor/gl_r8_snorm, border color only
spec/ext_texture_snorm/texwrap formats bordercolor/gl_rg8_snorm, border color only
spec/ext_texture_snorm/texwrap formats bordercolor/gl_rgb8_snorm, border color only
spec/ext_texture_snorm/texwrap formats bordercolor/gl_rgba8_snorm, border color only
spec/ext_texture_srgb/texwrap formats bordercolor-swizzled/gl_sluminance8, swizzled, border color only
spec/ext_texture_srgb/texwrap formats bordercolor-swizzled/gl_sluminance8_alpha8, swizzled, border color only
spec/ext_texture_srgb/texwrap formats bordercolor-swizzled/gl_srgb8, swizzled, border color only
spec/ext_texture_srgb/texwrap formats bordercolor-swizzled/gl_srgb8_alpha8, swizzled, border color only
spec/ext_texture_srgb/texwrap formats bordercolor/gl_sluminance8, border color only
spec/ext_texture_srgb/texwrap formats bordercolor/gl_sluminance8_alpha8, border color only
spec/ext_texture_srgb/texwrap formats bordercolor/gl_srgb8, border color only
spec/ext_texture_srgb/texwrap formats bordercolor/gl_srgb8_alpha8, border color only
Reviewed-by: Eric Anholt <eric@anholt.net>
Samuel Pitoiset [Mon, 15 Jul 2019 08:44:53 +0000 (10:44 +0200)]
radv/gfx10: add missing conversions for 16-bit exports
This fixes
dEQP-VK.spirv_assembly.instruction.graphics.16bit_storage.input_output_*
Found with RADV_DEBUG=checkir
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Mon, 15 Jul 2019 06:53:11 +0000 (08:53 +0200)]
radv: remove unused code in radv_export_param()
It was a hack for geometry shaders.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Dave Airlie [Mon, 15 Jul 2019 23:23:15 +0000 (16:23 -0700)]
radv/gfx10: don't set array pitch field on images
Setting this seems to be broken; amdvlk only sets it for quilted
textures, and I'm not sure what those are.
Fixes dEQP-VK.glsl.texture_functions.query.texturesize*3d*
Fixes: bf11f1c3a47 ("radv/gfx10: add gfx10_make_texture_descriptor")
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Vinson Lee [Mon, 15 Jul 2019 05:58:33 +0000 (22:58 -0700)]
lima/ppir: Fix assert condition in ppir_codegen_encode_branch.
Fixes: af0de6b91c0b ("lima/ppir: implement discard and discard_if")
Reported-by: Coverity Scan
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Eric Anholt [Fri, 12 Jul 2019 20:36:51 +0000 (13:36 -0700)]
docs: Tell people how to easily generate the Fixes lines.
v2: Include '-s' to suppress the diff.
v3: use the git config command (Ken), use < (Eric)
Reviewed-by: Matt Turner <mattst88@gmail.com> (v1)
Acked-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Caio Marcelo de Oliveira Filho [Wed, 3 Jul 2019 20:14:33 +0000 (13:14 -0700)]
spirv: Ignore ArrayStride for storage classes that should not use it
The stride was already overridden when using
lower_workgroup_access_to_offsets, so elaborate on the comment
there.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Caio Marcelo de Oliveira Filho [Wed, 3 Jul 2019 19:47:53 +0000 (12:47 -0700)]
spirv: Fix stride calculation when lowering Workgroup to offsets
Use alignment to calculate the stride associated with the pointer
types. That stride is used when the pointers are cast to arrays.
Note that size alone is not sufficient, e.g. struct { vec2 a; vec1 b;
} will have an element size of 12 bytes, but the stride needs
to be 16 bytes to respect the 8 byte alignment.
Fixes: 050eb6389a8 "spirv: Ignore ArrayStride in OpPtrAccessChain for Workgroup"
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
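The stride rule from the Workgroup fix above can be checked with a few lines of C. This is a minimal sketch, assuming power-of-two alignments; align_up and element_stride are hypothetical names for illustration, not functions from the SPIR-V front end.

```c
#include <assert.h>
#include <stdint.h>

/* Round size up to the next multiple of align.
 * Assumes align is a non-zero power of two. */
static uint32_t
align_up(uint32_t size, uint32_t align)
{
   assert(align != 0 && (align & (align - 1)) == 0);
   return (size + align - 1) & ~(align - 1);
}

/* Stride of one array element: its size padded out to its alignment,
 * so that every element in the array starts on an aligned address. */
static uint32_t
element_stride(uint32_t size, uint32_t align)
{
   return align_up(size, align);
}
```

For the struct in the message (12 bytes of data, 8 byte alignment), element_stride(12, 8) gives 16, whereas using the size alone would place the second element at offset 12 and misalign its vec2.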
Alyssa Rosenzweig [Mon, 15 Jul 2019 18:05:04 +0000 (11:05 -0700)]
panfrost/ci: Blacklist flush finish tests
We don't implement batch splitting quite yet which is necessary for the
ludicrous number of draw calls these tests invoke. Blacklist them for
now.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Mon, 15 Jul 2019 19:05:48 +0000 (12:05 -0700)]
panfrost: Don't leak oversized transient allocations
When we allocate them, we allocate with two references accidentally,
causing them to leak uncontrollably.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
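Why an accidental second reference leaks the allocation can be seen in a toy reference-counting model. This is a sketch only: struct toy_bo and the toy_bo_* functions are hypothetical and much simpler than the driver's real BO handling.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy buffer object: freed is set when the last reference is dropped. */
struct toy_bo {
   int refcount;
   bool freed;
};

/* A fresh allocation starts with exactly one reference, owned by the caller. */
static void
toy_bo_init(struct toy_bo *bo)
{
   bo->refcount = 1;
   bo->freed = false;
}

static void
toy_bo_reference(struct toy_bo *bo)
{
   bo->refcount++;
}

/* Drop one reference; the object is released only when the count hits 0. */
static void
toy_bo_unreference(struct toy_bo *bo)
{
   assert(bo->refcount > 0);
   if (--bo->refcount == 0)
      bo->freed = true;
}
```

If the allocation path takes an extra toy_bo_reference() that no one ever pairs with an unreference, the owner's single unreference leaves the count at 1 and the object is never released.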
Alyssa Rosenzweig [Mon, 15 Jul 2019 16:17:12 +0000 (09:17 -0700)]
panfrost: Implement panfrost_bo_cache_evict_all
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Mon, 15 Jul 2019 16:08:32 +0000 (09:08 -0700)]
panfrost: Implement panfrost_bo_cache_get
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Mon, 15 Jul 2019 16:00:54 +0000 (09:00 -0700)]
panfrost: Implement panfrost_bo_cache_put
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Mon, 15 Jul 2019 15:51:11 +0000 (08:51 -0700)]
panfrost: Add pan_bucket helper
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Mon, 15 Jul 2019 15:47:59 +0000 (08:47 -0700)]
panfrost: Implement pan_bucket_index helper
We'll use this whenever we need to lookup a bucket.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Mon, 15 Jul 2019 15:36:19 +0000 (08:36 -0700)]
panfrost: Add BO cache data structure
Linked list of panfrost_bo* nested inside an array of buckets.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
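A "linked list nested inside an array of buckets" can be sketched as follows. Everything here is an assumption for illustration: the toy_* names, the power-of-two bucketing by floor(log2(size)), and the 4 KiB to 4 MiB bucket range are not taken from the commits in this log.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MIN_BUCKET_LOG2 12 /* assumed: smallest bucket is for 4 KiB BOs */
#define TOY_MAX_BUCKET_LOG2 22 /* assumed: largest bucket is for >= 4 MiB BOs */
#define TOY_NUM_BUCKETS (TOY_MAX_BUCKET_LOG2 - TOY_MIN_BUCKET_LOG2 + 1)

struct toy_bo {
   size_t size;
   struct toy_bo *next; /* link within one bucket */
};

struct toy_bo_cache {
   struct toy_bo *buckets[TOY_NUM_BUCKETS]; /* one list head per size class */
};

static unsigned
toy_floor_log2(size_t v)
{
   unsigned r = 0;
   while (v >>= 1)
      r++;
   return r;
}

/* Map a size to its bucket, clamping to the supported range. */
static unsigned
toy_bucket_index(size_t size)
{
   unsigned l = toy_floor_log2(size);
   if (l < TOY_MIN_BUCKET_LOG2)
      l = TOY_MIN_BUCKET_LOG2;
   if (l > TOY_MAX_BUCKET_LOG2)
      l = TOY_MAX_BUCKET_LOG2;
   return l - TOY_MIN_BUCKET_LOG2;
}

/* "Fake" free: push the BO onto the front of its bucket's list. */
static void
toy_cache_put(struct toy_bo_cache *cache, struct toy_bo *bo)
{
   unsigned i = toy_bucket_index(bo->size);
   bo->next = cache->buckets[i];
   cache->buckets[i] = bo;
}

/* Unlink and return the first cached BO that is large enough, or NULL. */
static struct toy_bo *
toy_cache_get(struct toy_bo_cache *cache, size_t size)
{
   struct toy_bo **link = &cache->buckets[toy_bucket_index(size)];

   for (; *link != NULL; link = &(*link)->next) {
      if ((*link)->size >= size) {
         struct toy_bo *bo = *link;
         *link = bo->next;
         bo->next = NULL;
         return bo;
      }
   }
   return NULL;
}
```

A caller would try toy_cache_get() first and fall back to a fresh allocation on NULL; an evict step would walk the same lists and really free each entry.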
Alyssa Rosenzweig [Mon, 15 Jul 2019 15:27:57 +0000 (08:27 -0700)]
panfrost: Describe BO cache architecture
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Mon, 15 Jul 2019 15:22:33 +0000 (08:22 -0700)]
panfrost: Stub out panfrost_bo_cache_evict
This destructor will be used to legitimately free the BOs, now that a BO
free with cacheable=0 is only a "fake" free.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Mon, 15 Jul 2019 15:19:53 +0000 (08:19 -0700)]
panfrost: Stub out panfrost_bo_cache_put
..so we can intercept the BO free.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Mon, 15 Jul 2019 15:13:10 +0000 (08:13 -0700)]
panfrost: Stub out panfrost_bo_cache_get
We will use this function to fetch cached BOs instead of freshly
allocating them.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Mon, 15 Jul 2019 18:37:42 +0000 (11:37 -0700)]
panfrost: Don't leak the blend CSO hash table
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Mon, 15 Jul 2019 18:32:23 +0000 (11:32 -0700)]
panfrost: Cleanup after scoreboarding
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Mon, 15 Jul 2019 18:30:35 +0000 (11:30 -0700)]
panfrost: Allocate UBOs on the stack, not the heap
Saves a call to calloc (the maximum size is small and known at
compile-time) and fixes a leak.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Jason Ekstrand [Mon, 15 Jul 2019 15:31:49 +0000 (10:31 -0500)]
nir,intel: Add support for lowering 64-bit nir_opt_extract_*
We need this when doing full software 64-bit emulation.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110309
Fixes: cbad201c2b3 "nir/algebraic: Add missing 64-bit extract_[iu]8..."
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Jason Ekstrand [Wed, 10 Jul 2019 20:14:42 +0000 (15:14 -0500)]
nir/opt_if: Clean up single-src phis in opt_if_loop_terminator
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111071
Fixes: 2a74296f24ba "nir: add opt_if_loop_terminator()"
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Pierre-Eric Pelloux-Prayer [Fri, 5 Jul 2019 12:57:29 +0000 (14:57 +0200)]
radeonsi: verify buffer_offset value before using it
This buffer_offset can come directly from the application (e.g. when using
glVertexAttribPointer) and can contain an invalid value.
st_atom_array already makes sure that it's not negative so all that's left
is to verify that it's smaller than the buffer size.
Bugs related to this issue:
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105251#c52
Bugzilla: https://bugzilla.freedesktop.org/show_bug.cgi?id=109693
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Pierre-Eric Pelloux-Prayer [Fri, 5 Jul 2019 12:51:23 +0000 (14:51 +0200)]
st/mesa: verify that vertex buffer offset isn't negative
For drivers supporting PIPE_CAP_SIGNED_VERTEX_BUFFER_OFFSET the buffer_offset value
will be interpreted as a signed int.
An example of application code causing a negative offset:
    float b[] = { ... }; // 3 float for pos, 3 for color
    glBufferData(GL_ARRAY_BUFFER, ..., b, ...);
    glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 6 * sizeof(float), 0);
    glVertexAttribPointer(1, 3, GL_FLOAT, GL_FALSE, 6 * sizeof(float), &b[3]);
                                                                       ^
                                                            should be 3 * sizeof(float)
The offset is a ptr so when interpreted as a signed int it can be negative.
This commit adds a verification that (int) buffer_offset is not negative - this would
indicate an application bug. Since it's too late to emit a GL_INVALID_VALUE error,
we replace the negative offset by 0 and emit a debug message.
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
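The check described in this commit can be sketched in isolation. This is a minimal illustration, assuming a 32-bit offset field; sanitize_vertex_buffer_offset is a hypothetical name and the message goes to stderr here rather than through any GL debug-output path.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Return the offset to use for a vertex buffer binding.
 *
 * An offset with the top bit set would be negative when read as a signed
 * 32-bit int (two's complement), which is what happens when the application
 * passes a real pointer where a byte offset was expected. Such offsets are
 * replaced by 0 and reported. */
static uint32_t
sanitize_vertex_buffer_offset(uint32_t buffer_offset)
{
   if (buffer_offset > (uint32_t)INT32_MAX) {
      fprintf(stderr, "vertex buffer offset 0x%x is negative as a signed int, "
                      "using 0 instead\n", (unsigned)buffer_offset);
      return 0;
   }
   return buffer_offset;
}
```

Comparing against INT32_MAX is used instead of a cast to int because it gives the same answer without relying on implementation-defined conversion behavior.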
Marek Olšák [Wed, 3 Jul 2019 22:51:24 +0000 (18:51 -0400)]
st/mesa: don't invalidate a buffer range that is mapped
This is needed to fix an issue with OpenGL when a buffer is mapped and
BufferSubData is called. In this case, we can't invalidate the buffer range.
Marek Olšák [Wed, 3 Jul 2019 22:51:24 +0000 (18:51 -0400)]
gallium: use MAP_DIRECTLY to mean suppression of DISCARD in buffer_subdata
This is needed to fix an issue with OpenGL when a buffer is mapped and
BufferSubData is called. In this case, we can't invalidate the buffer range.
Kenneth Graunke [Fri, 12 Jul 2019 20:52:35 +0000 (13:52 -0700)]
iris: Better handle decoder base addresses
It can be useful to call the decoder on a single batch. But, that batch
may not contain STATE_BASE_ADDRESS, at which point the decoder will have
no idea how to find any buffers. We can initialize the two static bases
at the beginning of time, so it has them even if it never sees SBA.
Surface base address changes dynamically, possibly in the middle of a
batch. So we update it at the start of each batch, making it always
start at the value we inherited from the previous one. SBA commands
inside the batch can update it to a proper value.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Samuel Pitoiset [Mon, 15 Jul 2019 16:46:48 +0000 (18:46 +0200)]
radv/gfx10: enable OC_LDS_EN for NGG GS if the ES stage is TES
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Bas Nieuwenhuizen [Mon, 8 Jul 2019 10:47:14 +0000 (12:47 +0200)]
anv: Add android dependencies on android.
Specifically needed for nativewindow for some VK_EXT_external_memory_android_hardware_buffers
functions, where we call into some AHardwareBuffer functions.
The legacy Android ext did not have us call into any Android function
at all and hence it was not noticed.
Fixes: 755c633b8d9 "anv: Fix vulkan build in meson."
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Alyssa Rosenzweig [Mon, 15 Jul 2019 14:12:47 +0000 (07:12 -0700)]
panfrost: Advertise more depth/stencil formats
Fixes a regression in glmark's shadow/refract scenes.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Mon, 15 Jul 2019 14:10:31 +0000 (07:10 -0700)]
panfrost/mfbd: Add Z32 rendering support
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Mon, 15 Jul 2019 14:08:15 +0000 (07:08 -0700)]
panfrost: Fix blend_cso if nr_cbufs == 0
Fixes: 46396af1ec4b69ca4a ("panfrost: Refactor blend infrastructure")
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Fri, 12 Jul 2019 23:53:52 +0000 (16:53 -0700)]
panfrost: Cleanup shader upload code
The old algorithm is still used (and the same issue -- namely, leaking
all shaders -- applies) but we're way more concise about it since we're
*only* using the routine for shaders nowadays; everything else is a
BO-proper or transient.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Fri, 12 Jul 2019 23:45:44 +0000 (16:45 -0700)]
panfrost: Remove all old allocators
With the new refactor, this all becomes dead code.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Fri, 12 Jul 2019 23:38:11 +0000 (16:38 -0700)]
panfrost: Use transient memory for occlusion queries
These only last a frame anyway.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Fri, 12 Jul 2019 23:37:45 +0000 (16:37 -0700)]
panfrost: Remove bizarre hack
I don't think this is still necessary, and if it is, we'll have to
figure out how to fix it the right way.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Fri, 12 Jul 2019 23:35:47 +0000 (16:35 -0700)]
panfrost: Upload vertex descriptors to *transient* memory
It's not legal to reuse the vertex shader descriptor across frames now
that we patch it at draw-time, so upload to transient memory.
Ideally, we could be smarter about this such that subsequent draws with
the same vertex shader and same patched state would reuse the
descriptor, but for now, let's simply achieve correctness.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Fri, 12 Jul 2019 22:50:58 +0000 (15:50 -0700)]
panfrost: Delay resource mmaps
We use the new PAN_ALLOCATE_DELAY_MMAP flag to only map resources
on-demand, which should avoid mapping FBOs.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Fri, 12 Jul 2019 22:45:28 +0000 (15:45 -0700)]
panfrost: Cleanup PAN_ALLOCATE_*
While we're at it, prompted by a semantics issue around INVISIBLE, also
add a separate DELAY_MMAP flag.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Fri, 12 Jul 2019 22:39:47 +0000 (15:39 -0700)]
panfrost/drm: Don't mmap INVISIBLE buffers
On the new kernel, mmaping doesn't *hurt* per se, but it's still
wasteful for buffers explicitly marked as not needing an mmap.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Lionel Landwerlin [Mon, 15 Jul 2019 12:35:11 +0000 (15:35 +0300)]
anv: fix crash in vkCmdClearAttachments with unused attachment
anv_render_pass_compile() turns an unused attachment into a NULL
depth_stencil_attachment pointer so check that pointer before
accessing it.
Found with updates to existing CTS tests.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 208be8eafa30be ("anv: Make subpass::depth_stencil_attachment a pointer")
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
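The shape of this fix, checking the pointer before reading through it, can be sketched with simplified stand-in types. The toy_* structs and TOY_ATTACHMENT_UNUSED are hypothetical and only mirror the idea that an unused attachment is represented by a NULL depth_stencil_attachment pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_ATTACHMENT_UNUSED (~0u) /* sentinel for "no attachment" */

struct toy_attachment_ref {
   uint32_t attachment;
};

struct toy_subpass {
   /* NULL when the subpass has no depth/stencil attachment. */
   const struct toy_attachment_ref *depth_stencil_attachment;
};

/* Return the depth/stencil attachment index, or the sentinel if the
 * subpass has none. The NULL check must come before the dereference. */
static uint32_t
toy_depth_stencil_index(const struct toy_subpass *subpass)
{
   if (subpass->depth_stencil_attachment == NULL)
      return TOY_ATTACHMENT_UNUSED;
   return subpass->depth_stencil_attachment->attachment;
}
```

A clear path built on this would skip the depth/stencil clear entirely when the sentinel comes back.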
Samuel Pitoiset [Wed, 10 Jul 2019 22:34:18 +0000 (00:34 +0200)]
radv/gfx10: export the PrimitiveID for ES stages (VS or TES)
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Wed, 10 Jul 2019 22:29:50 +0000 (00:29 +0200)]
radv/gfx10: declare an external symbol for the ESGS ring
It will be used for stream output, but for now only declare it
for VS when the PrimitiveID needs to be exported.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Wed, 10 Jul 2019 22:25:28 +0000 (00:25 +0200)]
radv/gfx10: allocate ESGS ring space for exporting PrimitiveID
Only VS needs that. We shouldn't hardcode these values, but
avoiding that is complicated for now.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Sun, 14 Jul 2019 10:55:48 +0000 (12:55 +0200)]
radv/gfx10: fix crash when emitting NGG GS prologue
ac_nir_context is initialized after the driver emits the NGG GS
prologue so it's likely to crash.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Vasily Khoruzhick [Mon, 15 Jul 2019 01:22:36 +0000 (18:22 -0700)]
lima/ppir: Fix branch codegen
The "unknown_2" field is actually the size of the instruction that the
branch points to. If it's set to a smaller size than the actual
instruction, branch behavior is undefined (and it usually wedges the GPU).
Fix it by setting this field correctly.
Fixes: af0de6b91c0b ("lima/ppir: implement discard and discard_if")
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Vasily Khoruzhick [Mon, 15 Jul 2019 01:21:57 +0000 (18:21 -0700)]
lima/ppir: Fix assert condition in ppir_codegen_encode_discard
Fixes: af0de6b91c0b ("lima/ppir: implement discard and discard_if")
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Jonathan Marek [Wed, 19 Jun 2019 21:49:47 +0000 (17:49 -0400)]
etnaviv: fix incorrect varying interpolation
This corresponds to what the GC3000 blob does. The USED / UNUSED enums are
wrong, at least for GC2000/GC3000.
Without this the 3rd texture component is not interpolated correctly (flat?)
in the following test (and others):
dEQP-GLES2.functional.texture.mipmap.cube.generate.rgba8888_nicest
Strangely, when the texture is sampled from OpenGL it works correctly,
the problem only shows up for sampling by gallium/blitter. This fixes other
cube map tests which use util_blitter_blit.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Jonathan Marek [Wed, 19 Jun 2019 21:16:27 +0000 (17:16 -0400)]
etnaviv: reduce rs alignment requirement for two pixel pipes GPU
The rs alignment doesn't have to be multiplied by # of pixel pipes.
This works on GC2000 which doesn't have the SINGLE_BUFFER feature.
This fixes some cubemaps (NPOT / small mipmap levels) because aligning by 8
breaks the expected alignment of 4 for tiled format. We don't want to mess
with the alignment of tiled formats.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Jonathan Marek [Wed, 19 Jun 2019 15:43:54 +0000 (11:43 -0400)]
etnaviv: fix nearest_linear / linear_nearest filtering on GC3000
The MIN filter is never used when not using mipmaps. This fixes that.
Interestingly, only GC3000 needs this (GC2000 works without this fix).
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Jonathan Marek [Wed, 19 Jun 2019 15:42:13 +0000 (11:42 -0400)]
etnaviv: fix nearest filtering
ROUND_UV rounding breaks nearest filtering.
Enable it only when nearest filtering isn't used.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Bas Nieuwenhuizen [Sat, 13 Jul 2019 14:05:40 +0000 (16:05 +0200)]
radv/gfx10: Fix DCC clears.
Looks like if the reg clear bit is set, the hardware does not use the 0/1
clears for textures.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Vinson Lee [Thu, 13 Jun 2019 22:08:27 +0000 (15:08 -0700)]
meson: Add dep_thread dependency.
Fix this build error on Ubuntu 18.04.
/usr/bin/ld: src/util/libmesa_util.a(u_cpu_detect.c.o): undefined reference to symbol 'pthread_once@@GLIBC_2.2.5'
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110663
Suggested-by: Eric Engestrom <eric@engestrom.ch>
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Eric Engestrom <eric@engestrom.ch>
Eric Anholt [Thu, 11 Jul 2019 19:58:28 +0000 (12:58 -0700)]
gitlab-ci: Build i386 and ARM drivers in surfaceless mode.
I don't particularly care about getting x86/ARM cross-build coverage
of all the window systems, but we do want to be building src/mesa/
(for x86 asm) and gallium drivers (for vc4 NEON asm). I'm also hoping
to use these build products for testing freedreno on actual HW (which
we do using surfaceless).
This increases the docker image from 1.4G to 1.5G.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Acked-by: Eric Engestrom <eric@engestrom.ch>
Andreas Baierl [Thu, 11 Jul 2019 13:26:24 +0000 (15:26 +0200)]
lima: Fix compiler warnings for unused functions.
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Caio Marcelo de Oliveira Filho [Fri, 12 Jul 2019 21:37:38 +0000 (14:37 -0700)]
anv: Fix pool allocator when first alloc needs to grow
When using softpin, the first allocation was not calculating the
padding and offset correctly for the case where the first allocation
needed to grow. We were missing the initialization of state.end right
after expanding the pool for the first time.
This is not a problem for non-softpin since there we don't use
leftover padding so the ends would re-arrange incrementally.
This fixes running dEQP-VK.ssbo.phys.layout.random.16bit.scalar.13 in
SKL -- the test uses a shader larger than the initial size for the
instruction pool.
Fixes: dfc9ab2ccd9 ("anv/allocator: Add padding information.")
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Kenneth Graunke [Sun, 31 May 2015 23:02:36 +0000 (16:02 -0700)]
mesa: Port errors.c to util/list.h instead of simple_list.
There is widespread consensus that simple_list should go away.
This patch converts one more use to the modern kernel-style list.
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Jason Ekstrand [Fri, 12 Jul 2019 23:47:15 +0000 (18:47 -0500)]
intel: Run the optimization loop before and after lowering int64
For bindless SSBO access, we have to do 64-bit address calculations. On
ICL and above, we don't have 64-bit integer support so we have to lower
the address calculations to 32-bit arithmetic. If we don't run the
optimization loop before lowering, we won't fold any of the address
chain calculations before lowering 64-bit arithmetic and they aren't
really foldable afterwards. This cuts the size of the generated code in
the compute shader in dEQP-VK.ssbo.phys.layout.random.16bit.scalar.13 by
around 30%.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Alyssa Rosenzweig [Fri, 12 Jul 2019 15:57:10 +0000 (08:57 -0700)]
panfrost/decode: Drop _replay prefix
We don't even support replay anymore; this is just wasting characters
and adding clutter.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Fri, 12 Jul 2019 15:54:49 +0000 (08:54 -0700)]
panfrost/decode: Drop _name suffixes
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Fri, 12 Jul 2019 15:47:35 +0000 (08:47 -0700)]
panfrost/decode: Add MEMORY_PROP_DIR variant
This allows dumping memory properties directly without dereferencing an
address, allowing us to fix more -Waddress-of-packed-member warnings.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Fri, 12 Jul 2019 15:45:51 +0000 (08:45 -0700)]
panfrost/decode: Copy embedded structs before using
Fixes some, but not all, warnings from -Waddress-of-packed-member
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
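A short sketch of the copy-before-use pattern named in the commit above, assuming a GCC/Clang-style packed attribute. The struct layout and names (example_inner, example_outer, example_decode) are invented for illustration and do not mirror the real pandecode descriptors.

```c
#include <stdint.h>

/* Hypothetical descriptor layout; the real decoder structs differ. */
struct example_inner {
    uint32_t width;
    uint32_t height;
};

struct example_outer {
    uint8_t tag;
    struct example_inner inner; /* misaligned because of packing */
} __attribute__((packed));

/* Helper that expects a naturally aligned pointer. */
static uint32_t
example_area(const struct example_inner *in)
{
    return in->width * in->height;
}

static uint32_t
example_decode(const struct example_outer *outer)
{
    /* Passing &outer->inner would take the address of a packed member,
     * which is what -Waddress-of-packed-member flags. Copying by value
     * yields a properly aligned local instead. */
    struct example_inner copy = outer->inner;
    return example_area(&copy);
}
```

The copy costs a few bytes of stack per call, which is usually acceptable in a debug decoder.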
Alyssa Rosenzweig [Fri, 12 Jul 2019 15:45:36 +0000 (08:45 -0700)]
panfrost/decode: Remove pandecode_decode_fbd_type
It is unused.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Fri, 12 Jul 2019 15:41:13 +0000 (08:41 -0700)]
panfrost/midgard: Use generic outmod type
It could be midgard_outmod_float or midgard_outmod_int; don't assume
it's one or the other. Fixes -Wenum-conversion warnings.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Fri, 12 Jul 2019 21:48:34 +0000 (14:48 -0700)]
panfrost: Precompute scoreboard dependents
Mali job dependency graphs, at least for GLES3.0, have the special
property that a given node will only have at most a single dependent.
This allows us to efficiently precompute the dependent array and
replace an inner loop's O(N) search with an O(1) lookup, bringing the
algorithmic complexity of scoreboarding from O(N^2) to O(N).
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
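The precomputation described above can be sketched as follows. This is an illustration under stated assumptions, not the driver's code: the job record with up to two dependency indices (example_job) and the EXAMPLE_NO_JOB sentinel are hypothetical. Because each job has at most one dependent, a single integer per job is enough, and filling the array takes one pass over the jobs.

```c
#include <stddef.h>

#define EXAMPLE_NO_JOB (-1)

/* Hypothetical job record: each job depends on up to two other jobs. */
struct example_job {
    int deps[2]; /* indices of dependencies, or EXAMPLE_NO_JOB */
};

/* Fill dependents[i] with the index of the single job that depends on
 * job i, or EXAMPLE_NO_JOB if there is none. One pass, so O(N) total. */
static void
example_precompute_dependents(const struct example_job *jobs, size_t count,
                              int *dependents)
{
    for (size_t i = 0; i < count; ++i)
        dependents[i] = EXAMPLE_NO_JOB;

    for (size_t i = 0; i < count; ++i) {
        for (size_t d = 0; d < 2; ++d) {
            int dep = jobs[i].deps[d];
            if (dep != EXAMPLE_NO_JOB)
                dependents[dep] = (int) i;
        }
    }
}
```

A later pass that needs "who depends on job i" then reads dependents[i] directly instead of scanning every job.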
Alyssa Rosenzweig [Fri, 12 Jul 2019 21:09:57 +0000 (14:09 -0700)]
panfrost: Remove transient pool abstraction
Now that it has been totally replaced by the borrow mechanism, it is
unused code.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Fri, 12 Jul 2019 20:59:35 +0000 (13:59 -0700)]
panfrost: Subdivide fixed-size transient slabs
The whole purpose of the transient memory model is to make subdivision
stupidly easy, so let's handle that.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
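One common way to subdivide a fixed-size slab is a bump allocator, sketched below. The slab size, struct, and function name are assumptions made for illustration; the commit itself does not spell out the scheme used in the driver.

```c
#include <stddef.h>

#define EXAMPLE_SLAB_SIZE 4096 /* hypothetical slab size in bytes */

struct example_slab {
    size_t offset; /* first free byte in the slab */
};

/* Returns the offset of the new allocation within the slab, or
 * (size_t) -1 if it does not fit. align must be a power of two. */
static size_t
example_slab_alloc(struct example_slab *slab, size_t size, size_t align)
{
    /* Round the current offset up to the requested alignment. */
    size_t start = (slab->offset + align - 1) & ~(align - 1);

    if (size > EXAMPLE_SLAB_SIZE || start > EXAMPLE_SLAB_SIZE - size)
        return (size_t) -1;

    slab->offset = start + size;
    return start;
}
```

Freeing is a single store: resetting offset to zero releases every allocation in the slab at once, which suits memory that only lives for one frame.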
Alyssa Rosenzweig [Fri, 12 Jul 2019 20:05:14 +0000 (13:05 -0700)]
panfrost: Recycle fixed-size transient BOs
This is the usual case. We use the bitset to mark which BOs are free and
to seize one.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
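A rough sketch of bitset-based recycling, assuming a pool of 32 fixed-size slabs tracked in one 32-bit mask. The pool struct and function names are hypothetical; the driver's bitset helpers and pool size are not shown in the commit.

```c
#include <stdint.h>

#define EXAMPLE_NUM_SLABS 32 /* hypothetical pool size */

/* Bit i set means slab i is free. */
struct example_slab_pool {
    uint32_t free_mask;
};

/* Claim the lowest-numbered free slab; returns its index or -1. */
static int
example_seize_slab(struct example_slab_pool *pool)
{
    for (int i = 0; i < EXAMPLE_NUM_SLABS; ++i) {
        uint32_t bit = UINT32_C(1) << i;
        if (pool->free_mask & bit) {
            pool->free_mask &= ~bit; /* mark as in use */
            return i;
        }
    }
    return -1; /* no free slab: the caller must allocate a new one */
}

/* Hand a slab back to the pool so a later batch can reuse it. */
static void
example_release_slab(struct example_slab_pool *pool, int index)
{
    pool->free_mask |= UINT32_C(1) << index;
}
```

A batch would remember the index returned by example_seize_slab and pass it to example_release_slab once the GPU is done with the slab.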
Alyssa Rosenzweig [Fri, 12 Jul 2019 19:53:36 +0000 (12:53 -0700)]
panfrost: Bookkeep transient indices
The batch now temporarily possesses the transient buffer, so it'll need
to remember that to free it later.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Fri, 12 Jul 2019 19:49:23 +0000 (12:49 -0700)]
panfrost: Rewrite allocate_transient with new abstraction
We use a fixed size slab if we can, otherwise we create a dedicated
("oversized") BO and add that to the job. In the latter case we'll get
reference counting for free so we can forget about this corner case for
the rest of the series.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Fri, 12 Jul 2019 20:55:45 +0000 (13:55 -0700)]
panfrost: Add pan_bo_for_screen helper
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Thu, 11 Jul 2019 17:34:40 +0000 (10:34 -0700)]
panfrost: Add panfrost_transient_bo array
We would like transient allocations to occur on the screen (borrowed by
the batch) rather than on the context. Add fields to track this.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>