re PR target/25500 (SSE2 vectorized code is slower on 4.x.x than previous)