The previous code would emit a full set of state during the first EmitState on
a new cmdbuf, to ensure that state wasn't lost across UNLOCK/LOCK pairs (in the
case of context switching). This was rather inefficient. Instead, after
flushing a cmdbuf, mark the state as needing to be saved on UNLOCK. Then, at
the beginning of flushing a cmdbuf, if we actually have lost the context, go
back and emit a new cmdbuf with the full set of state, before continuing with
the cmdbuf flush. Also, remove the dirty/clean atom lists, since atoms are
emitted in a fixed order these days, and go with a simpler single list.
Provides a 14% improvement in ipers performance in my tests, along with other
apps.