From: Matt Turner Date: Tue, 2 Feb 2016 00:44:18 +0000 (-0800) Subject: nir: Recognize open-coded bitfield_reverse. X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=371c4b3c48f;p=mesa.git nir: Recognize open-coded bitfield_reverse. Helps 11 shaders in UnrealEngine4 demos. I seriously hope they would have given us bitfieldReverse() if we exposed GL 4.0 (but we do expose ARB_gpu_shader5, so why not use that anyway?). instructions in affected programs: 4875 -> 4633 (-4.96%) cycles in affected programs: 270516 -> 244516 (-9.61%) I suspect there's a *lot* of room to improve nir_search/opt_algebraic's handling of this. We'd actually like to match, e.g., step2 by matching step1 once and then doing a pointer comparison for the second instance of step1, but unfortunately we generate an enormous tuple for instead. The .text size increases by 6.5% and the .data by 17.5%. text data bss dec hex filename 22957 45224 0 68181 10a55 nir_libnir_la-nir_opt_algebraic.o 24461 53160 0 77621 12f35 nir_libnir_la-nir_opt_algebraic.o I'd be happy to remove this if Unreal4 uses bitfieldReverse() if it is in a GL 4.0 context once we expose GL 4.0. Reviewed-by: Jason Ekstrand --- diff --git a/src/compiler/nir/nir_opt_algebraic.py b/src/compiler/nir/nir_opt_algebraic.py index 9b82bac6f6c..cc2c2299ab9 100644 --- a/src/compiler/nir/nir_opt_algebraic.py +++ b/src/compiler/nir/nir_opt_algebraic.py @@ -312,6 +312,19 @@ optimizations = [ 'options->lower_unpack_snorm_4x8'), ] +# Unreal Engine 4 demo applications open-codes bitfieldReverse() +def bitfield_reverse(u): + step1 = ('ior', ('ishl', u, 16), ('ushr', u, 16)) + step2 = ('ior', ('ishl', ('iand', step1, 0x00ff00ff), 8), ('ushr', ('iand', step1, 0xff00ff00), 8)) + step3 = ('ior', ('ishl', ('iand', step2, 0x0f0f0f0f), 4), ('ushr', ('iand', step2, 0xf0f0f0f0), 4)) + step4 = ('ior', ('ishl', ('iand', step3, 0x33333333), 2), ('ushr', ('iand', step3, 0xcccccccc), 2)) + step5 = ('ior', ('ishl', ('iand', step4, 0x55555555), 1), ('ushr', ('iand', step4, 0xaaaaaaaa), 1)) + + return step5 + +optimizations += [(bitfield_reverse('x'), ('bitfield_reverse', 'x'))] + + # Add optimizations to handle the case where the result of a ternary is # compared to a constant. This way we can take things like #