i965: Fix asynchronous mappings on !LLC platforms.
When using a read-only CPU mapping, we may encounter stale buffer
contents. For example, the Piglit primitive-restart test offers the
following scenario:
1. Read data via a CPU map.
2. Destroy that buffer.
3. Create a new buffer - obtaining the same one via the BO cache.
4. Call BufferSubData, which does a GTT map with MAP_WRITE | MAP_ASYNC.
(We avoid set_domain for async mappings, so no flushing occurs.)
5. Read data via a CPU map.
(Without explicit clflushing, this will contain data from step 1!)
Otherwise, everything ought to work, keeping in mind that we never use
CPU maps for writing - just read-only CPU maps.
This restores the performance gains after Matt's revert in commit
71651b3139c501f50e6547c21a1cdb816b0a9dde.
v2: Do the invalidate later, and even when asking for a brand new map.
v3: Add more comments from Chris.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>