i965/vec4: emit correctly load_inputs for 64bit data
author Juan A. Suarez Romero <jasuarez@igalia.com>
Wed, 6 Jul 2016 10:40:49 +0000 (12:40 +0200)
committer Juan A. Suarez Romero <jasuarez@igalia.com>
Thu, 12 Jan 2017 11:56:56 +0000 (12:56 +0100)
For dvec3 and dvec4 types, a single GRF does not have enough space to
hold the inputs of two different vertices (SIMD4x2).

So the first GRF contains only the first two components for each of the
two vertices, and the next GRF holds the remaining components.

We want to put all the components for the same vertex in the same
register. Thus, we do a shuffle to reorder the data.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
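
The reordering described in the message can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not the driver's shuffle_64bit_data(): the names Grf, GrfPair, gather_per_vertex and run_example are hypothetical, a GRF is modelled as four 64-bit slots, and the slot order inside each incoming register (vertex 0 first, then vertex 1) is assumed rather than taken from the source.

```cpp
#include <array>
#include <cassert>

// Hypothetical model: one 256-bit GRF seen as four 64-bit slots.
using Grf = std::array<double, 4>;
using GrfPair = std::array<Grf, 2>;

// Assumed incoming layout (slot order is an assumption):
//   in[0] = { v0.x, v0.y, v1.x, v1.y }
//   in[1] = { v0.z, v0.w, v1.z, v1.w }
// Layout produced by this model:
//   out[v] = { v.x, v.y, v.z, v.w }
GrfPair gather_per_vertex(const GrfPair &in)
{
   GrfPair out{};
   for (int vertex = 0; vertex < 2; ++vertex) {
      for (int comp = 0; comp < 4; ++comp) {
         const int src_grf = comp / 2;               // xy in the first GRF, zw in the second
         const int src_slot = vertex * 2 + comp % 2; // each vertex owns two slots per GRF
         out[vertex][comp] = in[src_grf][src_slot];
      }
   }
   return out;
}

// Small self-check: values are encoded as <vertex><component>.
void run_example()
{
   const GrfPair in = {{ {{0.0, 1.0, 10.0, 11.0}}, {{2.0, 3.0, 12.0, 13.0}} }};
   const GrfPair out = gather_per_vertex(in);
   assert(out[0] == (Grf{{0.0, 1.0, 2.0, 3.0}}));
   assert(out[1] == (Grf{{10.0, 11.0, 12.0, 13.0}}));
}
```

Calling run_example() checks that each output register ends up with the x, y, z and w values of a single vertex. The real backend works on registers and swizzles rather than arrays, so this only mirrors the data movement that the message describes.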
src/mesa/drivers/dri/i965/brw_vec4_nir.cpp

index 98e023a66d9631da6f9571c7fb4569ac3e539964..71156ec5b3b4e416c043ce4203e04660404c7c74 100644
@@ -417,15 +417,24 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr *instr)
       /* We set EmitNoIndirectInput for VS */
       assert(const_offset);
 
+      dest = get_nir_dest(instr->dest);
+      dest.writemask = brw_writemask_for_size(instr->num_components);
+
       src = src_reg(ATTR, instr->const_index[0] + const_offset->u32[0],
                     glsl_type::uvec4_type);
-      /* Swizzle source based on component layout qualifier */
-      src.swizzle = BRW_SWZ_COMP_INPUT(nir_intrinsic_component(instr));
-
-      dest = get_nir_dest(instr->dest, src.type);
-      dest.writemask = brw_writemask_for_size(instr->num_components);
+      src = retype(src, dest.type);
 
-      emit(MOV(dest, src));
+      bool is_64bit = nir_dest_bit_size(instr->dest) == 64;
+      if (is_64bit) {
+         dst_reg tmp = dst_reg(this, glsl_type::dvec4_type);
+         src.swizzle = BRW_SWIZZLE_XYZW;
+         shuffle_64bit_data(tmp, src, false);
+         emit(MOV(dest, src_reg(tmp)));
+      } else {
+         /* Swizzle source based on component layout qualifier */
+         src.swizzle = BRW_SWZ_COMP_INPUT(nir_intrinsic_component(instr));
+         emit(MOV(dest, src));
+      }
       break;
    }