anv: Split dispatch tables into device and instance
There's no reason why we need generate trampoline functions for instance
functions or carry N copies of the instance dispatch table around for
every hardware generation. Splitting the tables and being more
conservative shaves about 34K off .text and about 4K off .data when
built with clang.
Before splitting dispatch tables:
text data bss dec hex filename
3224305 286216 8960
3519481 35b3f9 _install/lib64/libvulkan_intel.so
After splitting dispatch tables:
text data bss dec hex filename
3190325 282232 8960
3481517 351fad _install/lib64/libvulkan_intel.so
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>