From 8ad172c480d989b191027840d7b8a7ca2cf48df0 Mon Sep 17 00:00:00 2001
From: Luke Kenneth Casson Leighton
Date: Sun, 26 May 2024 15:04:08 +0100
Subject: [PATCH] https://bugs.libre-soc.org/show_bug.cgi?id=1166 add documentation of ed25519 on horizontal and vertical first algorithms covering ed25519_mul

---
 openpower/sv/cookbook/ed25519.mdwn | 59 ++++++++++++++++++++++++++++++
 1 file changed, 59 insertions(+)
 create mode 100644 openpower/sv/cookbook/ed25519.mdwn

diff --git a/openpower/sv/cookbook/ed25519.mdwn b/openpower/sv/cookbook/ed25519.mdwn
new file mode 100644
index 000000000..3c3cec4bc
--- /dev/null
+++ b/openpower/sv/cookbook/ed25519.mdwn
@@ -0,0 +1,59 @@
+# Elliptic Curve ed25519
+
+Links:
+
+* [ed25519_mul.py](https://git.libre-soc.org/?p=openpower-isa.git;a=blob_plain;f=src/openpower/decoder/isa/ed25519/curve25519_mul.py;hb=de10b86f7b3)
+* [donna-ed25519](https://bugs.libre-soc.org/show_bug.cgi?id=773#c1)
+* [Triangular REMAP discussion](https://lists.libre-soc.org/pipermail/libre-soc-dev/2022-May/004881.html)
+
+ed25519 is strategically important as its implementation was highly optimised
+during its design, for high security.
+[Edwards-curve Digital Signature Algorithm (EdDSA)](https://en.wikipedia.org/wiki/EdDSA#Ed25519)
+was also designed to be fast.
+
+In the donna-ed25519 implementation, key functions such as ed25519_mul
+are laid out explicitly by loop-unrolling:
+
+```
+t[0] = r0 * s0
+t[1] = r0 * s1 + r1 * s0;
+t[2] = r0 * s2 + r1 * s1 + r2 * s0;
+t[3] = r0 * s3 + r1 * s2 + r2 * s1 + r3 * s0;
+t[4] = r0 * s4 + r1 * s3 + r2 * s2 + r3 * s1 + r4 * s0;
+```
+
+Note the very obvious patterns here which are triangular in nature.
+With the very existence of Simple-V's REMAP subsystem it is quite
+natural to see if triangular remapping can be added and used.
+It turns out to be quite easy, and there are two possible techniques:
+Vertical-First and Horizontal-First.
+
+With Vertical-First, the multiply is done first as a scalar item,
+into a temporary register, followed by an addition of the scalar
+into the actual target (t0 thru t4).
+
+```
+sv.mul temp, *r, *s # temporary target scalar register
+sv.add *t,*t,temp # add temporary scalar onto target vector
+```
+
+With Horizontal-First it is extremely simple: use `madd` -
+integer multiply-and-accumulate:
+
+```
+sv.madd *t, *r, *s
+```
+
+In both cases, all three target registers are set up with the
+same REMAP Schedules. Additionally in both cases, t0-t4 must
+be pre-initialised to zeros.
+
+As always with Simple-V, the power of simplicity comes primarily
+from the REMAP subsystem. However, in a secure environment,
+reduced instruction count is also critical not just for power
+consumption but to get the size of the binary down small enough
+that it could fit easily into a few lines of L1 Cache.
+If a huge number of loop-unrolled instructions (the normal
+way of handling these algorithms) are reduced down to a
+bare handful, with the looping covered in hardware, then it
+is easy to understand how valuable Simple-V and REMAP are.
-- 
2.30.2
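
Review note: the triangular index pattern that the patch describes can be modelled in plain Python to check that the two instruction sequences accumulate the same values as the unrolled sums. This is only a sketch under stated assumptions. The function names (`triangular_schedule`, `mul_unrolled`, `mul_two_step`, `mul_madd`) are hypothetical and are not taken from `curve25519_mul.py`, the order in which index triples are visited is an assumption and may differ from the real REMAP hardware schedule, and only the five sums quoted in the patch are covered.

```python
def triangular_schedule(n):
    """Yield (t_idx, r_idx, s_idx) so that t[k] accumulates r[i] * s[k - i].

    Assumption: row-by-row ordering. The real triangular REMAP schedule
    may visit the same triples in a different order.
    """
    for k in range(n):
        for i in range(k + 1):
            yield k, i, k - i


def mul_unrolled(r, s):
    """The five explicit sums quoted in the patch, written out by hand."""
    r0, r1, r2, r3, r4 = r
    s0, s1, s2, s3, s4 = s
    return [
        r0 * s0,
        r0 * s1 + r1 * s0,
        r0 * s2 + r1 * s1 + r2 * s0,
        r0 * s3 + r1 * s2 + r2 * s1 + r3 * s0,
        r0 * s4 + r1 * s3 + r2 * s2 + r3 * s1 + r4 * s0,
    ]


def mul_two_step(r, s):
    """Model of the sv.mul / sv.add pair: scalar temp, then accumulate."""
    t = [0] * len(r)  # t0-t4 must start at zero, as the patch states
    for ti, ri, si in triangular_schedule(len(r)):
        temp = r[ri] * s[si]   # models: sv.mul temp, *r, *s
        t[ti] = t[ti] + temp   # models: sv.add *t, *t, temp
    return t


def mul_madd(r, s):
    """Model of the single sv.madd: one multiply-accumulate per step."""
    t = [0] * len(r)  # t0-t4 must start at zero, as the patch states
    for ti, ri, si in triangular_schedule(len(r)):
        t[ti] += r[ri] * s[si]  # models: sv.madd *t, *r, *s
    return t


if __name__ == "__main__":
    r = [1, 2, 3, 4, 5]
    s = [6, 7, 8, 9, 10]
    print(mul_madd(r, s))  # prints [6, 19, 40, 70, 110]
```

For five limbs the schedule has 15 steps, one per product term in the unrolled listing, which is why the pre-zeroed `t` matters: every step adds into the target rather than overwriting it.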