From f843aabdd42051abc8b22437d5c9167fc867ac46 Mon Sep 17 00:00:00 2001 From: =?utf8?q?Kristian=20H=C3=B8gsberg=20Kristensen?= Date: Mon, 22 Feb 2016 09:14:25 -0800 Subject: [PATCH] intel/genxml: Add README I've had people ask about the design of the pack functions, for example, why aren't we using bitfields. I wrote up a bit of background on why and how we ended up with the current design and we might as well keep that with the code. --- src/intel/genxml/README | 60 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 60 insertions(+) create mode 100644 src/intel/genxml/README diff --git a/src/intel/genxml/README b/src/intel/genxml/README new file mode 100644 index 00000000000..bc518c60bad --- /dev/null +++ b/src/intel/genxml/README @@ -0,0 +1,60 @@ +This provides some background the design of the generated headers. We +started out trying to generate bit fields but it evolved into the pack +functions because of a few limitations: + + 1) Bit fields still generate terrible code today. Even with modern + optimizing compilers you get multiple load+mask+store operations + to the same dword in memory as you set individual bits. The + compiler also has to generate code to mask out overflowing values + (for example, if you assign 200 to a 2 bit field). Our driver + never writes overflowing values so that's not needed. On the + other hand, most compiler recognize that the template struct we + use is a temporary variable and copy propagate the individual + fields and do amazing constant folding. You should take a look + at the code that gets generated when you compile in release mode + with optimizations. + + 2) For some types we need to have overlapping bit fields. For + example, some values are 64 byte aligned 32 bit offsets. The + lower 5 bits of the offset are always zero, so the hw packs in a + few misc bits in the lower 5 bits there. Other times a field can + be either a u32 or a float. I tried to do this with overlapping + anonymous unions and it became a big mess. Also, when using + initializers, you can only initialize one union member so this + just doesn't work with out approach. + + The pack functions on the other hand allows us a great deal of + flexibility in how we combine things. In the case of overlapping + fields (the u32 and float case), if we only set one of them in + the pack function, the compiler will recognize that the other is + initialized to 0 and optimize out the code to or it it. + + 3) Bit fields (and certainly overlapping anonymous unions of bit + fields) aren't generally stable across compilers in how they're + laid out and aligned. Our pack functions let us control exactly + how things get packed, using only simple and unambiguous bitwise + shifting and or'ing that works on any compiler. + +Once we have the pack function it allows us to hook in various +transformations and validation as we go from template struct to dwords +in memory: + + 1) Validation: As I said above, our driver isn't supposed to write + overflowing values to the fields, but we've of course had lots of + cases where we make mistakes and write overflowing values. With + the pack function, we can actually assert on that and catch it at + runtime. bitfields would just silently truncate. + + 2) Type conversions: some times it's just a matter of writing a + float to a u32, but we also convert from bool to bits, from + floats to fixed point integers. + + 3) Relocations: whenever we have a pointer from one buffer to + another (for example a pointer from the meta data for a texture + to the raw texture data), we have to tell the kernel about it so + it can adjust the pointer to point to the final location. That + means extra work we have to do extra work to record and annotate + the dword location that holds the pointer. With bit fields, we'd + have to call a function to do this, but with the pack function we + generate code in the pack function to do this for us. That's a + lot less error prone and less work. -- 2.30.2