gallium/tgsi_exec: Fix up NumOutputs counting
We can get duplicate declarations for an index (for example dvec3 + float
packed into 2 vec4s, the second one won't pack into the first's array
decl), and we'd end up stepping by the wrong amount in GS vtx/prim emit.
Fixes vs-gs-fs-double, sso-vs-gs-fs-array-interleave piglit tests.
Fixes: 49155c3264d0 ("draw/tgsi: fix geometry shader input/output swizzling")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6567>