llvmpipe: handle indirect images properly Reviewed-by: Roland Scheidegger <sroland@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3778>
gallivm/nir: refactor image operations for indirect support. This just refactors the image code, so that outdata is passed explicitly, and refactors the internal handling of NONE formats. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3778>
llvmpipe: pass number of images into image soa create Just store this for now to use later with indexing Reviewed-by: Roland Scheidegger <sroland@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3778>
llvmpipe/draw: wire up indirect offset This bounds checks and adds to the llvm index. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3778>
gallivm/sample: pass indirect offset into texture/image units This isn't needed for the basic indirect code, but it is needed for texture size/image size unfortunately. They could be done with a super switch, but it seems simple to query them. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3778>
llvmpipe: add support for indirect texture access. This hooks up the sampler switch statement generator to the llvmpipe sampler interface. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3778>
llvmpipe: pass number of samplers into llvm sampler code. This is to be used later for indirect texture access Reviewed-by: Roland Scheidegger <sroland@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3778>
llvmpipe: add samples support to image jit Reviewed-by: Roland Scheidegger <sroland@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4122>
llvmpipe: add num_samples/sample_stride support to jit textures This adds the support for num_samples/sample_stride retrieval to the jit texture infrastructure. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4122>
llvmpipe: add fragment shader image support Reviewed-by: Roland Scheidegger <sroland@vmware.com>
llvmpipe: s/Elements/ARRAY_SIZE/ Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
gallivm: convert size query to using a set of parameters. This isn't currently that easy to expand, so fix it up before expanding it later to include dynamic samplers. [airlied: use some local variables (Roland)] Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
gallium/drivers: Trivial code-style cleanup Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>
llvmpipe: add cache for compressed textures compressed textures are very slow because decoding is rather complex (and because there's no jit code code to decode them too for non-technical reasons). Thus, add some texture cache which holds a couple of decoded blocks. Right now this handles only s3tc format albeit it could be extended to work with other formats rather trivially as long as the result of decode fits into 32bit per texel (ideally, rgtc actually would decode to more than 8 bits per channel, but even then making it work for it shouldn't be too difficult). This can improve performance noticeably but don't expect wonders (uncompressed is unsurprisingly still faster). It's also possible it might be slower in some cases (using nearest filtering for example or if there's otherwise not many cache hits, the cache is only direct mapped which isn't great). Also, actual decode of a block relies on util code, thus even though always full blocks are decoded it is done texel by texel - this could obviously benefit greatly from simd-optimized code decoding full blocks at once... Note the cache is per (raster) thread, and currently only used for fragment shaders. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
gallivm: simplify sampler interface This has got a bit out of control with more and more parameters added. Worse, whenever something in there changes all callees have to be updated for that, even though they don't really do much with any parameter in there except pass it on to the actual sampling function. Hence simply put almost everything into a struct. Also instead of relying on some arguments being NULL, be explicit and set this in a key (which is just reused for function generation for simplicity). (The code still relies on them being NULL in the end for now.) Technically there is a minimal functional change here for shadow sampling: if shadow sampling is done is now determined explicitly by the texture function (either sample_c or the gl-style tex func inherit this from target) instead of the static texture state. These two should always match, however. Otherwise, it should generate all the same code. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
gallivm: pass jit_context pointer through to sampling The callbacks used for getting the dynamic texture/sampler state were using the jit_context from the generated jit function. This works just fine, however that way it's impossible to generate separate functions for texture sampling, as will be done in the next commit. Hence, pass this pointer through all interfaces so it can be passed to a separate function (technically, it would probably be possible to extract this pointer from the current function instead, but this feels hacky and would probably require some more hacks if we'd use real functions instead of inlining all shader functions at some point). There should be no difference in the generated code for now. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
gallivm: implement better control of per-quad/per-element/scalar lod There's a new debug value used to disable per-quad lod optimizations in fragment shader (ignored for vs/gs as the results are just too wrong typically). Also trying to detect if a supplied lod value is really a scalar (if it's coming from immediate or constant file) in which case sampler code can use this to stay on per-quad-lod path (in fact for explicit lod could simplify even further and use same lod for both quads in the avx case but this is not implemented yet). Still need to actually implement per-element lod bias (and derivatives), and need to handle per-element lod in size queries. v2: fix comments, prettify. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
gallivm: set non-existing values really to zero in size queries for d3d10 My previous attempt at doing so double-failed miserably (minification of zero still gives one, and even if it would not the value was never written anyway). While here also rename the confusingly named int_vec bld as we have int vecs of different sizes, and rename need_nr_mips (as this also changes out-of-bounds behavior) to is_sviewinfo too. Reviewed-by: Zack Rusin <zackr@vmware.com>
gallivm: use texture target from shader instead of static state for size query d3d10 has no notion of distinct array resources neither at the resource nor sampler view level. However, shader dcl of resources certainly has, and d3d10 expects resinfo to return the values according to that - in particular a resource might have been a 1d texture with some array layers, then the sampler view might have only used 1 layer so it can be accessed both as 1d or 1d array texture (I think - the former definitely works). resinfo of a resource decleared as array needs to return number of array layers but non-array resource needs to return 0 (and not 1). Hence fix this by passing the target from the shader decl to emit_size_query and use that (in case of OpenGL the target will come from the instruction itself). Could probably do the same for actual sampling, though it may not matter there (as the bogus components will essentially get clamped away), possibly could wreak havoc though if it REALLY doesn't match (which is of course an error but still). Reviewed-by: Zack Rusin <zackr@vmware.com>
gallivm: propagate scalar_lod to emit_size_query too Clearly the returned values need to be per-element if the lod is per element. Does not actually change behavior yet. Reviewed-by: Zack Rusin <zackr@vmware.com>