Less IFETCH latency on misses. Shader code is write once read many,
so GTT doesn't make much sense anyway.
If it turns out to fragment the CPU visible VRAM too much, we can upload with SDMA.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
S_00B848_FLOAT_MODE(variant->config.float_mode);
variant->bo = device->ws->buffer_create(device->ws, binary->code_size, 256,
- RADEON_DOMAIN_GTT, RADEON_FLAG_CPU_ACCESS);
+ RADEON_DOMAIN_VRAM, RADEON_FLAG_CPU_ACCESS);
void *ptr = device->ws->buffer_map(variant->bo);
memcpy(ptr, binary->code, binary->code_size);
variant->ref_count = 1;
variant->bo = device->ws->buffer_create(device->ws, entry->code_size, 256,
- RADEON_DOMAIN_GTT, RADEON_FLAG_CPU_ACCESS);
+ RADEON_DOMAIN_VRAM, RADEON_FLAG_CPU_ACCESS);
void *ptr = device->ws->buffer_map(variant->bo);
memcpy(ptr, entry->code, entry->code_size);