rewrite additional matrix-related functions to reduce register pressure