-- --
-- S p e c --
-- --
--- Copyright (C) 2004-2011, Free Software Foundation, Inc. --
+-- Copyright (C) 2004-2017, Free Software Foundation, Inc. --
-- --
-- GNAT is free software; you can redistribute it and/or modify it under --
-- terms of the GNU General Public License as published by the Free Soft- --
-- | | | |
-- Vector_Types Vector_Operations Vector_Views Conversions
--- The user can manipulate vectors through two families of types: Vector
+-- Users can manipulate vectors through two families of types: Vector
-- types and View types.
--- Vector types are defined in the GNAT.Altivec.Vector_Types package
+-- Vector types are available through the Vector_Types and Vector_Operations
+-- packages, which implement the core binding to the AltiVec API, as
+-- described in [PIM-2.1 data types] and [PIM-4 AltiVec operations and
+-- predicates].
--- On these types, users can apply the Altivec operations defined in
--- GNAT.Altivec.Vector_Operations. Their layout is opaque and may vary across
--- configurations, for it is typically target-endianness dependant.
+-- The layout of Vector objects is dependant on the target machine
+-- endianness, and View types were devised to offer a higher level user
+-- interface. With Views, a vector of 4 uints (1, 2, 3, 4) is always declared
+-- with a VUI_View := (Values => (1, 2, 3, 4)), element 1 first, natural
+-- notation to denote the element values, and indexed notation is available
+-- to access individual elements.
--- Vector_Types and Vector_Operations implement the core binding to the
--- AltiVec API, as described in [PIM-2.1 data types] and [PIM-4 AltiVec
--- operations and predicates].
-
--- View types are defined in the GNAT.Altivec.Vector_Views package
-
--- These types do not represent Altivec vectors per se, in the sense that the
+-- View types do not represent Altivec vectors per se, in the sense that the
-- Altivec_Operations are not available for them. They are intended to allow
-- Vector initializations as well as access to the Vector component values.
-- The "hard" version would map to real AltiVec instructions via GCC builtins
-- and inlining.
+-- See the "Design Notes" section below for additional details on the
+-- internals.
+
-------------------
-- Example usage --
-------------------
CR6_LT_REV : constant := 3;
end GNAT.Altivec;
+
+--------------------
+-- Design Notes --
+--------------------
+
+------------------------
+-- General principles --
+------------------------
+
+-- The internal organization has been devised from a number of driving ideas:
+
+-- o From the clients standpoint, the two versions of the binding should be
+-- as easily exchangable as possible,
+
+-- o From the maintenance standpoint, we want to avoid as much code
+-- duplication as possible.
+
+-- o From both standpoints above, we want to maintain a clear interface
+-- separation between the base bindings to the Motorola API and the
+-- additional facilities.
+
+-- The identification of the low level interface is directly inspired by the
+-- the base API organization, basically consisting of a rich set of functions
+-- around a core of low level primitives mapping to AltiVec instructions.
+
+-- See for instance "vec_add" in [PIM-4.4 Generic and Specific AltiVec
+-- operations]: no less than six result/arguments combinations of byte vector
+-- types map to "vaddubm".
+
+-- The "hard" version of the low level primitives map to real AltiVec
+-- instructions via the corresponding GCC builtins. The "soft" version is
+-- a software emulation of those.
+
+---------------------------------------
+-- The Low_Level_Vectors abstraction --
+---------------------------------------
+
+-- The AltiVec C interface spirit is to map a large set of C functions down
+-- to a much smaller set of AltiVec instructions, most of them operating on a
+-- set of vector data types in a transparent manner. See for instance the
+-- case of vec_add, which maps six combinations of result/argument types to
+-- vaddubm for signed/unsigned/bool variants of 'char' components.
+
+-- The GCC implementation of this idiom for C/C++ is to setup builtins
+-- corresponding to the instructions and to expose the C user function as
+-- wrappers around those builtins with no-op type conversions as required.
+-- Typically, for the vec_add case mentioned above, we have (altivec.h):
+--
+-- inline __vector signed char
+-- vec_add (__vector signed char a1, __vector signed char a2)
+-- {
+-- return (__vector signed char)
+-- __builtin_altivec_vaddubm ((__vector signed char) a1,
+-- (__vector signed char) a2);
+-- }
+
+-- inline __vector unsigned char
+-- vec_add (__vector __bool char a1, __vector unsigned char a2)
+-- {
+-- return (__vector unsigned char)
+-- __builtin_altivec_vaddubm ((__vector signed char) a1,
+-- (__vector signed char) a2);
+-- }
+
+-- The central idea for the Ada bindings is to leverage on the existing GCC
+-- architecture, with the introduction of a Low_Level_Vectors abstraction.
+-- This abstaction acts as a representative of the vector-types and builtins
+-- compiler interface for either the Hard or the Soft case.
+
+-- For the Hard binding, Low_Level_Vectors exposes data types with a GCC
+-- internal translation identical to the "vector ..." C types, and a set of
+-- subprograms mapping straight to the internal GCC builtins.
+
+-- For the Soft binding, Low_Level_Vectors exposes the same set of types
+-- and subprograms, with bodies simulating the instructions behavior.
+
+-- Vector_Types/Operations "simply" bind the user types and operations to
+-- some Low_Level_Vectors implementation, selected in accordance with the
+-- target
+
+-- To achieve a complete Hard/Soft independence in the Vector_Types and
+-- Vector_Operations implementations, both versions of the low level support
+-- are expected to expose a number of facilities:
+
+-- o Private data type declarations for base vector representations embedded
+-- in the user visible vector types, that is:
+
+-- LL_VBC, LL_VUC and LL_VSC
+-- for vector_bool_char, vector_unsigned_char and vector_signed_char
+
+-- LL_VBS, LL_VUS and LL_VSS
+-- for vector_bool_short, vector_unsigned_short and vector_signed_short
+
+-- LL_VBI, LL_VUI and LL_VSI
+-- for vector_bool_int, vector_unsigned_int and vector_signed_int
+
+-- as well as:
+
+-- LL_VP for vector_pixel and LL_VF for vector_float
+
+-- o Primitive operations corresponding to the AltiVec hardware instruction
+-- names, like "vaddubm". The whole set is not described here. The actual
+-- sets are inspired from the GCC builtins which are invoked from GCC's
+-- "altivec.h".
+
+-- o An LL_Altivec convention identifier, specifying the calling convention
+-- to be used to access the aforementioned primitive operations.
+
+-- Besides:
+
+-- o Unchecked_Conversion are expected to be allowed between any pair of
+-- exposed data types, and are expected to have no effect on the value
+-- bit patterns.
+
+-------------------------
+-- Vector views layout --
+-------------------------
+
+-- Vector Views combine intuitive user level ordering for both elements
+-- within a vector and bytes within each element. They basically map to an
+-- array representation where array(i) always represents element (i), in the
+-- natural target representation. This way, a user vector (1, 2, 3, 4) is
+-- represented as:
+
+-- Increasing Addresses
+-- ------------------------------------------------------------------------->
+
+-- | 0x0 0x0 0x0 0x1 | 0x0 0x0 0x0 0x2 | 0x0 0x0 0x0 0x3 | 0x0 0x0 0x0 0x4 |
+-- | V (0), BE | V (1), BE | V (2), BE | V (3), BE |
+
+-- on a big endian target, and as:
+
+-- | 0x1 0x0 0x0 0x0 | 0x2 0x0 0x0 0x0 | 0x3 0x0 0x0 0x0 | 0x4 0x0 0x0 0x0 |
+-- | V (0), LE | V (1), LE | V (2), LE | V (3), LE |
+
+-- on a little-endian target
+
+-------------------------
+-- Vector types layout --
+-------------------------
+
+-- In the case of the hard binding, the layout of the vector type in
+-- memory is documented by the Altivec documentation. In the case of the
+-- soft binding, the simplest solution is to represent a vector as an
+-- array of components. This representation can depend on the endianness.
+-- We can consider three possibilities:
+
+-- * First component at the lowest address, components in big endian format.
+-- It is the natural way to represent an array in big endian, and it would
+-- also be the natural way to represent a quad-word integer in big endian.
+
+-- Example:
+
+-- Let V be a vector of unsigned int which value is (1, 2, 3, 4). It is
+-- represented as:
+
+-- Addresses growing
+-- ------------------------------------------------------------------------->
+-- | 0x0 0x0 0x0 0x1 | 0x0 0x0 0x0 0x2 | 0x0 0x0 0x0 0x3 | 0x0 0x0 0x0 0x4 |
+-- | V (0), BE | V (1), BE | V (2), BE | V (3), BE |
+
+-- * First component at the lowest address, components in little endian
+-- format. It is the natural way to represent an array in little endian.
+
+-- Example:
+
+-- Let V be a vector of unsigned int which value is (1, 2, 3, 4). It is
+-- represented as:
+
+-- Addresses growing
+-- ------------------------------------------------------------------------->
+-- | 0x1 0x0 0x0 0x0 | 0x2 0x0 0x0 0x0 | 0x3 0x0 0x0 0x0 | 0x4 0x0 0x0 0x0 |
+-- | V (0), LE | V (1), LE | V (2), LE | V (3), LE |
+
+-- * Last component at the lowest address, components in little endian format.
+-- It is the natural way to represent a quad-word integer in little endian.
+
+-- Example:
+
+-- Let V be a vector of unsigned int which value is (1, 2, 3, 4). It is
+-- represented as:
+
+-- Addresses growing
+-- ------------------------------------------------------------------------->
+-- | 0x4 0x0 0x0 0x0 | 0x3 0x0 0x0 0x0 | 0x2 0x0 0x0 0x0 | 0x1 0x0 0x0 0x0 |
+-- | V (3), LE | V (2), LE | V (1), LE | V (0), LE |
+
+-- There is actually a fourth case (components in big endian, first
+-- component at the lowest address), but it does not have any interesting
+-- properties: it is neither the natural way to represent a quad-word on any
+-- machine, nor the natural way to represent an array on any machine.
+
+-- Example:
+
+-- Let V be a vector of unsigned int which value is (1, 2, 3, 4). It is
+-- represented as:
+
+-- Addresses growing
+-- ------------------------------------------------------------------------->
+-- | 0x0 0x0 0x0 0x4 | 0x0 0x0 0x0 0x3 | 0x0 0x0 0x0 0x2 | 0x0 0x0 0x0 0x1 |
+-- | V (3), BE | V (2), BE | V (1), BE | V (0), BE |
+
+-- Most of the Altivec operations are specific to a component size, and
+-- can be implemented with any of these three formats. But some operations
+-- are defined by the same Altivec primitive operation for different type
+-- sizes:
+
+-- * operations doing arithmetics on a complete vector, seen as a quad-word;
+-- * operations dealing with memory.
+
+-- Operations on a complete vector:
+-- --------------------------------
+
+-- Examples:
+
+-- vec_sll/vsl : shift left on the entire vector.
+-- vec_slo/vslo: shift left on the entire vector, by octet.
+
+-- Those operations works on vectors seens as a quad-word.
+-- Let us suppose that we have a conversion operation named To_Quad_Word
+-- for converting vector types to a quad-word.
+
+-- Let A be a Altivec vector of 16 components:
+-- A = (A(0), A(1), A(2), A(3), ... , A(14), A(15))
+-- Let B be a Altivec vector of 8 components verifying:
+-- B = (A(0) |8| A(1), A(2) |8| A(3), ... , A(14) |8| A(15))
+-- Let C be a Altivec vector of 4 components verifying:
+-- C = (A(0) |8| A(1) |8| A(2) |8| A(3), ... ,
+-- A(12) |8| A(13) |8| A(14) |8| A(15))
+
+-- (definition: |8| is the concatenation operation between two bytes;
+-- i.e. 0x1 |8| 0x2 = 0x0102)
+
+-- According to [PIM - 4.2 byte ordering], we have the following property:
+-- To_Quad_Word (A) = To_Quad_Word (B) = To_Quad_Word (C)
+
+-- Let To_Type_Of_A be a conversion operation from the type of B to the
+-- type of A. The quad-word operations are only implemented by one
+-- Altivec primitive operation. That means that, if QW_Operation is a
+-- quad-word operation, we should have:
+-- QW_Operation (To_Type_of_A (B)) = QW_Operation (A)
+
+-- That is true iff:
+-- To_Quad_Word (To_Type_of_A (B)) = To_Quad_Word (A)
+
+-- As To_Quad_Word is a bijection. we have:
+-- To_Type_of_A (B) = A
+
+-- resp. any combination of A, B, C:
+-- To_Type_of_A (C) = A
+-- To_Type_of_B (A) = B
+-- To_Type_of_C (B) = C
+-- ...
+
+-- Making sure that the properties described above are verified by the
+-- conversion operations between vector types has different implications
+-- depending on the layout of the vector types:
+-- * with format 1 and 3: only a unchecked conversion is needed;
+-- * with format 2 and 4: some reorganisation is needed for conversions
+-- between vector types with different component sizes; that has a cost on the
+-- efficiency, plus the complexity of having different memory pattern for
+-- the same quad-word value, depending on the type.
+
+-- Operation dealing with memory:
+-- ------------------------------
+
+-- These operations are either load operation (vec_ld and the
+-- corresponding primitive operation: vlx) or store operation (vec_st
+-- and the corresponding primitive operation: vstx).
+
+-- According to [PIM 4.4 - vec_ld], those operations take in input
+-- either an access to a vector (e.g. a const_vector_unsigned_int_ptr)
+-- or an access to a flow of components (e.g. a const_unsigned_int_ptr),
+-- relying on the same Altivec primitive operations. That means that both
+-- should have the same representation in memory.
+
+-- For the stream, it is easier to adopt the format of the target. That
+-- means that, in memory, the components of the vector should also have the
+-- format of the target. meaning that we will prefer:
+-- * On a big endian target: format 1 or 4
+-- * On a little endian target: format 2 or 3
+
+-- Conclusion:
+-- -----------
+
+-- To take into consideration the constraint brought about by the routines
+-- operating on quad-words and the routines operating on memory, the best
+-- choice seems to be:
+
+-- * On a big endian target: format 1;
+-- * On a little endian target: format 3.
+
+-- Those layout choices are enforced by GNAT.Altivec.Low_Level_Conversions,
+-- which is the endianness-dependant unit providing conversions between
+-- vector views and vector types.
+
+----------------------
+-- Layouts summary --
+----------------------
+
+-- For a user abstract vector of 4 uints (1, 2, 3, 4), increasing
+-- addresses from left to right:
+
+-- =========================================================================
+-- BIG ENDIAN TARGET MEMORY LAYOUT for (1, 2, 3, 4)
+-- =========================================================================
+
+-- View
+-- -------------------------------------------------------------------------
+-- | 0x0 0x0 0x0 0x1 | 0x0 0x0 0x0 0x2 | 0x0 0x0 0x0 0x3 | 0x0 0x0 0x0 0x4 |
+-- | V (0), BE | V (1), BE | V (2), BE | V (3), BE |
+-- -------------------------------------------------------------------------
+
+-- Vector
+-- -------------------------------------------------------------------------
+-- | 0x0 0x0 0x0 0x1 | 0x0 0x0 0x0 0x2 | 0x0 0x0 0x0 0x3 | 0x0 0x0 0x0 0x4 |
+-- | V (0), BE | V (1), BE | V (2), BE | V (3), BE |
+-- -------------------------------------------------------------------------
+
+-- =========================================================================
+-- LITTLE ENDIAN TARGET MEMORY LAYOUT for (1, 2, 3, 4)
+-- =========================================================================
+
+-- View
+-- -------------------------------------------------------------------------
+-- | 0x1 0x0 0x0 0x0 | 0x2 0x0 0x0 0x0 | 0x3 0x0 0x0 0x0 | 0x4 0x0 0x0 0x0 |
+-- | V (0), LE | V (1), LE | V (2), LE | V (3), LE |
+
+-- Vector
+-- -------------------------------------------------------------------------
+-- | 0x4 0x0 0x0 0x0 | 0x3 0x0 0x0 0x0 | 0x2 0x0 0x0 0x0 | 0x1 0x0 0x0 0x0 |
+-- | V (3), LE | V (2), LE | V (1), LE | V (0), LE |
+-- -------------------------------------------------------------------------
+
+-- These layouts are common to both the soft and hard implementations on
+-- Altivec capable targets.