an FP64 Multiply typically takes between 12,000 and 15,000. Not counting
the cost in hardware terms is just asking for trouble.
+If the number of gates gets too large there is an unintended side-effect:
+not only does power consumption go up, so does the distance between
+functions on-chip. A good illustration is the CDC 6600 and the Cray
+supercomputers, where speed was limited by the size of the *room*. In
+other words, larger functions cause communication delays, and
+communication delays reduce top speed.
+
**How long will it take to complete?**
In the case of divide or Transcendentals the algorithms needed are so