intel/compiler: Properly consider UBO loads that cross 32B boundaries.
author Kenneth Graunke <kenneth@whitecape.org>
Fri, 8 Jun 2018 21:24:16 +0000 (14:24 -0700)
committer Kenneth Graunke <kenneth@whitecape.org>
Thu, 14 Jun 2018 21:58:59 +0000 (14:58 -0700)
commit f6898f2b554e88255909bdd6bd0f7a91d99c446b
tree cbead1bd0a96a88aeee807b09e7f2b90ea2eef51
parent 37bd9ccd21b860d2b5ffea7e1f472ec83b68b43b

The UBO push analysis pass incorrectly assumed that all values would fit
within a 32B chunk, and only recorded a bit for the 32B chunk containing
the starting offset.

For example, if a UBO contained the following, tightly packed:

   vec4 a;  // [0, 16)
   float b; // [16, 20)
   vec4 c;  // [20, 36)

then c would start in 32B chunk 20 / 32 = 0 and end in chunk 35 / 32 = 1
(its last byte is at offset 35), which means that we ought to record two
32B chunks in the bitfield.

Similarly, dvec4s (32 bytes each) would suffer from the same problem
whenever they do not start on a 32B boundary.
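
The chunk accounting described above can be sketched as follows. This is
only an illustration, not the code in brw_nir_analyze_ubo_ranges.c: the
helper name chunks_touched and its parameters are hypothetical, and it
assumes a non-empty load that ends within the first 64 chunks.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not taken from Mesa): return a bitfield with one
 * bit set for every 32B chunk touched by a load of `bytes` bytes that
 * starts at byte offset `offset`.
 */
static uint64_t
chunks_touched(unsigned offset, unsigned bytes)
{
   assert(bytes > 0);

   const unsigned first = offset / 32;
   /* Chunk containing the last byte of the load, not one past the end. */
   const unsigned last = (offset + bytes - 1) / 32;
   assert(last < 64);

   const unsigned count = last - first + 1;

   /* Build `count` consecutive one bits, then shift them up to `first`. */
   const uint64_t mask = count >= 64 ? ~0ull : (1ull << count) - 1;
   return mask << first;
}
```

With the layout above, chunks_touched(20, 16) returns 0x3 (chunks 0 and
1) for c, whereas recording only the chunk of the starting offset would
give 0x1. A 32-byte dvec4 at offset 48 yields 0x6 (chunks 1 and 2).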

v2: Rewrite the accounting, my calculations were wrong.
v3: Write a comment about partial values (requested by Jason).

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> [v1]
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> [v3]
src/intel/compiler/brw_nir_analyze_ubo_ranges.c