freedreno/a3xx: refactor/optimize emit
Because we reuse various bits of emit code (for state/vertex/prog/etc)
for both regular draws and internal draws (gmem<->mem, clear, etc), the
number of parameters getting passed around has been growing. Refactor
to group these into fd3_emit. This simplifies fxn signatures, avoids
passing around shader key on the stack, etc. It also gives us a nice
place to cache shader-variant lookup to avoid looking up shader variants
multiple times per draw (without having to *also* pass them around as
fxn args everywhere).
Signed-off-by: Rob Clark <robclark@freedesktop.org>