intel/tools/error: Decode shaders while decoding batch commands.
This makes aubinator_error_decode's shader dumping work like aubinator.
Instead of printing them after the fact, it prints them right inside the
3DSTATE_VS/HS/DS/GS/PS packet that references them. This saves you the
effort of cross-referencing things and jumping back and forth.
It also reduces a bunch of book-keeping, and eliminates the limitation
that we could only handle 4096 programs. That code was also broken and
failed to print any shaders if there were under 4096 programs.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>