switch the f16 implementation to `half`