<https://en.wikipedia.org/wiki/Fast_Fourier_transform#Applications>
ARM has already added `vqrdmulhq_s16/32` instructions as their inclusion
in any ISA replaces **eight** non-Twin-Butterfly instructions, which
-are often loop-unrolled, resulting in L1 I-Cache stripmining.
+are often loop-unrolled, resulting in L1 I-Cache stripmining as well
+as requiring far greater resources, or much more complex hardware, to
+execute efficiently.
**Notes and Observations**: