implement arb_vertex_program in hw for r200. Code contains still some hacks, generic attribs cause a fallback, but otherwise it seems to work quite well. Passes all glean vertProg1 tests with the exception of the degnerated LIT case (which is a hw limitation), as well as runs the r200 render path of doom3/quake4 (1.1 patch needed for quake4). The code is heavily borrowed from the r300 driver as vertex programs encoding is almost identical. arb_vertex_program is not yet announced by default and still needs to be enabled via driconf.