compiler: generate memmove for non-pointer slice copy
The builtin copy function is lowered to one of the runtime functions
slicecopy, stringslicecopy, or typedslicecopy. The first two are
thin wrappers around memmove. Instead of making a runtime call, we
can use __builtin_memmove directly, which gives the compiler backend
opportunities for further optimization.
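
As a rough illustration of the lowered form, the sketch below shows what
copy(dst, src) amounts to for a pointer-free element type. The struct
layout and the names int_slice and copy_ints are hypothetical, chosen for
this example only; they are not the frontend's actual representation or
generated code, and the guard on n is a choice made for this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a Go slice of int; field names are
   illustrative only.  */
struct int_slice
{
  int *values;
  intptr_t count;
  intptr_t capacity;
};

/* Sketch of copy(dst, src) for a pointer-free element type: move
   min(len(dst), len(src)) elements and return that count.  */
static intptr_t
copy_ints (struct int_slice dst, struct int_slice src)
{
  intptr_t n = dst.count < src.count ? dst.count : src.count;

  /* Skip the call for an empty copy, so a nil slice never reaches
     memmove.  memmove (rather than memcpy) is needed because dst and
     src may share a backing array.  */
  if (n > 0)
    __builtin_memmove (dst.values, src.values, (size_t) n * sizeof (int));
  return n;
}
```

Because the call is a compiler builtin rather than an opaque runtime
function, the backend is free to expand small or constant-size moves
inline.
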
Move the lowering of the builtin copy function to the flatten phase
to make the rewriting easier.
Also do this optimization for the copy part of append(s1, s2...).
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/170005
From-SVN: r271017