Clean up benchmarks; support uarch-specific counters