From 6762e153a9e4e450b1a9904f9c96ec9f9b4cbc31 Mon Sep 17 00:00:00 2001 From: Luis Machado Date: Tue, 31 Jan 2023 23:54:39 +0000 Subject: [PATCH] sme: Document SME registers and features Provide documentation for the SME feature and other information that should be useful for users that need to debug a SME-capable target. Reviewed-By: Eli Zaretskii Reviewed-by: Thiago Jung Bauermann --- gdb/NEWS | 11 ++ gdb/doc/gdb.texinfo | 252 ++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 263 insertions(+) diff --git a/gdb/NEWS b/gdb/NEWS index 20f53a0d542..1dc0e403797 100644 --- a/gdb/NEWS +++ b/gdb/NEWS @@ -3,6 +3,17 @@ *** Changes since GDB 13 +* GDB now supports the AArch64 Scalable Matrix Extension (SME), which includes + a new matrix register named ZA, a new thread register TPIDR2 and a new vector + length register SVG (streaming vector granule). GDB also supports tracking + ZA state across signal frames. + + Some features are still under development or are dependent on ABI specs that + are still in alpha stage. For example, manual function calls with ZA state + don't have any special handling, and tracking of SVG changes based on + DWARF information is still not implemented, but there are plans to do so in + the future. + * GDB now recognizes the NO_COLOR environment variable and disables styling according to the spec. See https://no-color.org/. Styling can be re-enabled with "set style enabled on". diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo index e1385cfb519..8ce70235dc1 100644 --- a/gdb/doc/gdb.texinfo +++ b/gdb/doc/gdb.texinfo @@ -26140,6 +26140,227 @@ but the lengths of the @code{z} and @code{p} registers will not change. This is a known limitation of @value{GDBN} and does not affect the execution of the target process. +For SVE, the following definitions are used throughout @value{GDBN}'s source +code and in this document: + +@itemize + +@item +@var{vl}: The vector length, in bytes. It defines the size of each @code{Z} +register. 
+@anchor{vl} +@cindex vl + +@item +@var{vq}: The number of 128 bit units in @var{vl}. This is mostly used +internally by @value{GDBN} and the Linux Kernel. +@anchor{vq} +@cindex vq + +@item +@var{vg}: The number of 64 bit units in @var{vl}. This is mostly used +internally by @value{GDBN} and the Linux Kernel. +@anchor{vg} +@cindex vg + +@end itemize + +@subsubsection AArch64 SME. +@anchor{AArch64 SME} +@cindex SME +@cindex AArch64 SME +@cindex Scalable Matrix Extension + +The Scalable Matrix Extension (@url{https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/scalable-matrix-extension-armv9-a-architecture, @acronym{SME}}) +is an AArch64 architecture extension that expands on the concept of the +Scalable Vector Extension (@url{https://developer.arm.com/documentation/101726/4-0/Learn-about-the-Scalable-Vector-Extension--SVE-/What-is-the-Scalable-Vector-Extension-, @acronym{SVE}}) +by providing a 2-dimensional register @code{ZA}, which is a square +matrix of variable size, just like SVE provides a group of vector registers of +variable size. + +Similarly to SVE, where the size of each @code{Z} register is directly related +to the vector length (@var{vl} for short), the @acronym{SME} @code{ZA} matrix +register's size is directly related to the streaming vector length +(@var{svl} for short). @xref{vl}. @xref{svl}. + +The @code{ZA} register state can be either active, when it is in use, or +inactive, when it is not. + +@acronym{SME} also introduces a new execution mode called streaming +@acronym{SVE} mode (streaming mode for short). When streaming mode is +enabled, the program supports execution of @acronym{SVE2} instructions and the +@acronym{SVE} registers will have vector length @var{svl}. When streaming +mode is disabled, the SVE registers have vector length @var{vl}. + +For more information about @acronym{SME} and @acronym{SVE}, please refer to +official @url{https://developer.arm.com/documentation/ddi0487/latest, +architecture documentation}.
+ +The following definitions are used throughout @value{GDBN}'s source code and +in this document: + +@itemize + +@item +@var{svl}: The streaming vector length, in bytes. It defines the size of each +dimension of the 2-dimensional square @code{ZA} matrix. The total size of +@code{ZA} is therefore @var{svl} by @var{svl}. + +When streaming mode is enabled, it defines the size of the @acronym{SVE} +registers as well. +@anchor{svl} +@cindex svl + +@item +@var{svq}: The number of 128 bit units in @var{svl}. This is mostly used +internally by @value{GDBN} and the Linux Kernel. +@anchor{svq} +@cindex svq + +@item +@var{svg}: The number of 64 bit units in @var{svl}, also known as streaming +vector granule. This is mostly used internally by @value{GDBN} and the Linux +Kernel. +@anchor{svg} +@cindex svg + +@end itemize + +When @value{GDBN} is debugging the AArch64 architecture, if the Scalable Matrix +Extension (@acronym{SME}) is present, then @value{GDBN} will make the @code{ZA} +register available. @value{GDBN} will also make the @code{SVG} register and +@code{SVCR} pseudo-register available. + +The @code{ZA} register is a 2-dimensional square @var{svl} by @var{svl} +matrix of bytes. To simplify the representation and access to the @code{ZA} +register in @value{GDBN}, it is defined as a vector of +@var{svl}x@var{svl} bytes. + +If the user wants to index the @code{ZA} register as a matrix, it is possible +to reference @code{ZA} as @code{ZA[@var{i}][@var{j}]}, where @var{i} is the +row number and @var{j} is the column number. + +The @code{SVG} register always contains the streaming vector granule +(@var{svg}) for the current thread. From the value of register @code{SVG} we +can easily derive the @var{svl} value. + +@anchor{aarch64 sme svcr} +The @code{SVCR} pseudo-register (streaming vector control register) is a status +register that holds two state bits: @sc{sm} in bit 0 and @sc{za} in bit 1.
+ +If the @sc{sm} bit is 1, it means the current thread is in streaming +mode, and the @acronym{SVE} registers will use @var{svl} for their sizes. If +the @sc{sm} bit is 0, the current thread is not in streaming mode, and the +@acronym{SVE} registers will use @var{vl} for their sizes. @xref{vl}. + +If the @sc{za} bit is 1, it means the @code{ZA} register is being used and +has meaningful contents. If the @sc{za} bit is 0, the @code{ZA} register is +unavailable and its contents are undefined. + +For convenience and simplicity, if the @sc{za} bit is 0, the @code{ZA} +register and all of its pseudo-registers will read as zero. + +If @var{svl} changes during the execution of a program, then the @code{ZA} +register size and the bits in the @code{SVCR} pseudo-register will be updated +to reflect it. + +It is possible for users to change @var{svl} during the execution of a +program by modifying the @code{SVG} register value. + +Whenever the @code{SVG} register is modified with a new value, the +following will be observed: + +@itemize + +@item The @sc{za} and @sc{sm} bits will be cleared in the @code{SVCR} +pseudo-register. + +@item The @code{ZA} register will have a new size and its state will be +cleared, forcing its contents and the contents of all of its pseudo-registers +back to zero. + +@item If the @sc{sm} bit was 1, the @acronym{SVE} registers will be reset to +having their sizes based on @var{vl} as opposed to @var{svl}. If the +@sc{sm} bit was 0 prior to modifying the @code{SVG} register, there will be no +observable effect on the @acronym{SVE} registers. + +@end itemize + +The possible values for the @code{SVG} register are 2, 4, 8, 16, 32. These +numbers correspond to streaming vector length (@var{svl}) values of 16 +bytes, 32 bytes, 64 bytes, 128 bytes and 256 bytes respectively. + +The minimum size of the @code{ZA} register is 16 x 16 (256) bytes, and the +maximum size is 256 x 256 (65536) bytes. 
In streaming mode, with bit @sc{sm} +set, the size of the @code{ZA} register is the size of all the SVE @code{Z} +registers combined. + +The @code{ZA} register can also be accessed using tiles and tile slices. + +Tile pseudo-registers are square, 2-dimensional sub-arrays of elements within +the @code{ZA} register. + +The tile pseudo-registers have the following naming pattern: +@code{ZA<@var{tile number}><@var{qualifier}>}. + +There is a total of 31 @code{ZA} tile pseudo-registers. They are +@code{ZA0B}, @code{ZA0H} through @code{ZA1H}, @code{ZA0S} through @code{ZA3S}, +@code{ZA0D} through @code{ZA7D} and @code{ZA0Q} through @code{ZA15Q}. + +Tile slice pseudo-registers are vectors of horizontally or vertically +contiguous elements within the @code{ZA} register. + +The tile slice pseudo-registers have the following naming pattern: +@code{ZA<@var{tile number}><@var{direction}><@var{qualifier}> +<@var{slice number}>}. + +There are up to 16 tiles (0 ~ 15), the direction can be either @code{v} +(vertical) or @code{h} (horizontal), the qualifiers can be @code{b} (byte), +@code{h} (halfword), @code{s} (word), @code{d} (doubleword) and @code{q} +(quadword) and there are up to 256 slices (0 ~ 255) depending on the value +of @var{svl}. The number of slices is the same as the value of @var{svl}. + +The number of available tile slice pseudo-registers can be large. For a +minimum @var{svl} of 16 bytes, there are 5 (number of qualifiers) x +2 (number of directions) x 16 (@var{svl}) pseudo-registers. For the +maximum @var{svl} of 256 bytes, there are 5 x 2 x 256 pseudo-registers. + +When listing all the available registers, users will see the +currently-available @code{ZA} pseudo-registers. Pseudo-registers that don't +exist for a given @var{svl} value will not be displayed. 
+ +For more information on @acronym{SME} and its terminology, please refer to the +@url{https://developer.arm.com/documentation/ddi0616/aa/, +Arm Architecture Reference Manual Supplement}, The Scalable Matrix Extension +(@acronym{SME}), for Armv9-A. + +Some features are still under development and rely on +@url{https://github.com/ARM-software/acle/releases/latest, ACLE} and +@url{https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst, ABI} +definitions, so there are known limitations to the current @acronym{SME} +support in @value{GDBN}. + +One such example is calling functions in the program being debugged by +@value{GDBN}. Such calls are not @acronym{SME}-aware and thus don't take into +account the @code{SVCR} pseudo-register bits nor the @code{ZA} register +contents. @xref{Calling}. + +The @url{https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#the-za-lazy-saving-scheme, +lazy saving scheme} involving the @code{TPIDR2} register is not yet supported +by @value{GDBN}, though the @code{TPIDR2} register is known and supported +by @value{GDBN}. + +Lastly, an important limitation for @command{gdbserver} is its inability to +communicate @var{svl} changes to @value{GDBN}. This means @command{gdbserver}, +even though it is capable of adjusting its internal caches to reflect a change +in the value of @var{svl} mid-execution, will operate with a potentially +different @var{svl} value compared to @value{GDBN}. This can lead to +@value{GDBN} showing incorrect values for the @code{ZA} register and +incorrect values for SVE registers (when in streaming mode). + +This is the same limitation we have for the @acronym{SVE} registers, and there +are plans to address this limitation going forward. + @subsubsection AArch64 Pointer Authentication. @cindex AArch64 Pointer Authentication. @anchor{AArch64 PAC} @@ -48380,6 +48601,37 @@ This restriction may be lifted in the future. 
Extra registers are allowed in this feature, but they will not affect @value{GDBN}. +@subsubsection AArch64 SME registers feature + +The @samp{org.gnu.gdb.aarch64.sme} feature is optional. If present, +it should contain registers @code{ZA}, @code{SVG} and @code{SVCR}. +@xref{AArch64 SME}. + +@itemize @minus + +@item +@code{ZA} is a register represented by a vector of @var{svl}x@var{svl} +bytes. @xref{svl}. + +@item +@code{SVG} is a 64-bit register containing the value of @var{svg}. @xref{svg}. + +@item +@code{SVCR} is a 64-bit status pseudo-register with two valid bits. Bit 0 +(@sc{sm}) shows whether the streaming @acronym{SVE} mode is enabled or disabled. +Bit 1 (@sc{za}) shows whether the @code{ZA} register state is active (in use) or +not. +@xref{aarch64 sme svcr}. + +The remaining bits of the @code{SVCR} pseudo-register are undefined +and reserved. Such bits should not be used and may be defined by future +extensions of the architecture. + +@end itemize + +Extra registers are allowed in this feature, but they will not affect +@value{GDBN}. + @node ARC Features @subsection ARC Features @cindex target descriptions, ARC Features -- 2.30.2
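
Reviewer note: the arithmetic that the new documentation states in prose (SVG to svl, ZA size, the two SVCR bits, and the tile and tile slice counts) can be cross-checked with a small standalone sketch. This is not GDB code and does not use any GDB API. All function names are hypothetical, and the lowercase register spelling is an assumption that follows the naming patterns given in the text.

```python
# Standalone sketch of the SME arithmetic described in the documentation.
# Not GDB code: every helper name here is hypothetical.

# Qualifier letter -> element size in bytes (b, h, s, d, q as in the text).
QUALIFIERS = {"b": 1, "h": 2, "s": 4, "d": 8, "q": 16}

# Values of the SVG register that the documentation lists as possible.
VALID_SVG = (2, 4, 8, 16, 32)


def svl_from_svg(svg):
    """svg counts 64-bit (8-byte) units in svl, so svl = svg * 8 bytes."""
    if svg not in VALID_SVG:
        raise ValueError(f"unsupported SVG value: {svg}")
    return svg * 8


def za_size(svl):
    """ZA is a square svl x svl matrix of bytes."""
    return svl * svl


def decode_svcr(svcr):
    """Bit 0 is SM (streaming mode), bit 1 is ZA (ZA state active)."""
    return {"sm": bool(svcr & 0x1), "za": bool(svcr & 0x2)}


def tile_names():
    """One byte tile, 2 halfword, 4 word, 8 doubleword, 16 quadword tiles."""
    return [f"za{tile}{q}"
            for q, size in QUALIFIERS.items()
            for tile in range(size)]


def tile_slice_names(svl):
    """Names following ZA<tile><direction><qualifier><slice>.

    A tile with element size `size` has svl // size slices per direction,
    and there are `size` such tiles, so each qualifier contributes svl
    slices per direction.
    """
    names = []
    for q, size in QUALIFIERS.items():
        for tile in range(size):
            for direction in ("h", "v"):
                for index in range(svl // size):
                    names.append(f"za{tile}{direction}{q}{index}")
    return names


if __name__ == "__main__":
    for svg in VALID_SVG:
        svl = svl_from_svg(svg)
        print(svg, svl, za_size(svl), len(tile_slice_names(svl)))
```

Under these assumptions the counts agree with the text: 31 tile pseudo-registers regardless of svl, and 5 x 2 x svl tile slice pseudo-registers for a given svl.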